ABSTRACT


The  process  of  finding  an  exact  minimization  for  a  multiple-valued  logic  (MVL) 
expression  requires  an  extensive  search  and  enormous  computation  time.  One  of  the  heuristics 
to  reduce  this  computation  time  is  the  Neighborhood  Decoupling  (ND)  Algorithm  by  Yang  and 
Wang.  This  algorithm  finds  near-optimal  solutions  for  the  given  MVL  expressions.  The  ND 
algorithm  is  an  extension  of  HAMLET  (Heuristic  Analyzer  for  Multiple-valued  Logic 
Expressions). 

The primary goal of this thesis is to reduce the computation time of the ND algorithm
by using parallel processors. We developed a parallel version of the ND algorithm and tested
it on an iPSC/2 (Intel Personal Supercomputer). The parallel version executes in parallel
the portion of the ND algorithm known as the clustering factor calculation. The number of
nodes needed to run the programs is twice the number of input variables of the expression.
The results indicate that the parallel version of the ND algorithm halves the computation
time compared to the sequential version.

A secondary goal of this thesis is to initiate the parallelization of HAMLET and the
study of parallel computers, i.e., the iPSC/2. The experience we obtained with the iPSC/2
suggests an alternative algorithm. The ND algorithm searches the first branch of the search
tree, assuming that the optimum solution will be on that branch. We developed a Multi-branch
Concurrent ND (MCND) algorithm which concurrently searches multiple branches, hence
increasing the probability of reaching the optimum.
I. INTRODUCTION


A.  MOTIVATION 

Very-large-scale-integration  (VLSI)  technology  has  matured  to  a  point  where 
large  logic  circuits  are  economically  realized  in  silicon.  However,  two  major 
problems,  bus  connection  and  pin  limitation,  are  bottlenecks  to  further  integration. 
Multiple-valued logic offers a solution to these problems. In recent years,
multiple-valued logic has been used in programmable logic arrays (PLAs) based on
charge-coupled devices (CCD) or current-mode CMOS [Ref. 1, 2, 3, 4]. PLAs provide a
structured and modular approach to logic design. Consequently, there has been
considerable interest in computer-aided design and logic synthesis tools for
multiple-valued PLAs.

Several  heuristic  algorithms  have  been  developed  for  the  multiple-valued  logic 
minimization  and  each  claims  some  advantages  in  specific  examples,  but  none  of 
them  is  consistently  better  than  the  others  [Ref.  5,  6,  7,  8,  9].  Heuristic  algorithms 
are  important  because  the  only  known  algorithms  guaranteed  to  find  a  minimal 
solution  require  an  enormous  search  and  are  extremely  time  consuming.  A  heuristic 
called  the  Neighborhood  Decoupling  Algorithm  (ND)  has  been  developed  at  the 
Naval  Postgraduate  School  (NPS)[Ref.  10].  This  algorithm  finds  near  minimal 
solutions  for  given  MVL  expressions.  However,  for  large  PLA’s,  computation  time 
needed  is  also  large. 
This thesis shows how to reduce the computation time needed to minimize
multiple-valued logic expressions by using parallel computers. Specifically, a parallel
version of the Neighborhood Decoupling Algorithm is implemented in concurrent C
and run on the iPSC/2 (Intel Personal Supercomputer).

B.  BACKGROUND 

With  the  computer  software  developed  at  NPS  called  HAMLET  (Heuristic 
Analyzer  for  Multiple-valued  Logic  Expression  Translation),  users  can  investigate 
heuristics  of  their  own  [Ref.  12].  The  HAMLET  execution  procedure  of  these 
algorithms  is  abstracted  as  follows.  Formal  definitions  will  be  covered  in  the  next 
chapter. Let $f$ be a multiple-valued function, and let $\alpha$ be a minterm of $f$.

Input: let $M$ be the set of minterms of a function $f$;

Output: the minimized sum of products, $S$, of the original function;

$S \leftarrow \emptyset$.

While ($M \neq \emptyset$) do {

    pick one minterm $\alpha$ from $M$;
    find an implicant $I_\alpha$ which covers $\alpha$;
    $S \leftarrow I_\alpha \cup S$;
    subtract $I_\alpha$ from $f$;

}
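The loop above can be sketched in Python as a skeleton with pluggable selection rules. This is only an illustration under stated assumptions: the dictionary representation of the function, the names `direct_cover`, `first_minterm`, and `single_cell`, and the toy selection rules are hypothetical and are not HAMLET's actual interface; plain subtraction is used and don't-care handling is omitted.

```python
from typing import Callable, Dict, List, Tuple

Coord = Tuple[int, ...]
# Assumed model: an implicant is (coefficient, list of covered coordinates).
Implicant = Tuple[int, List[Coord]]


def direct_cover(
    f: Dict[Coord, int],
    pick_minterm: Callable[[Dict[Coord, int]], Coord],
    pick_implicant: Callable[[Dict[Coord, int], Coord], Implicant],
) -> List[Implicant]:
    """S <- empty; while M is nonempty: pick a minterm, cover it, subtract."""
    work = {c: v for c, v in f.items() if v != 0}  # M: minterms still to cover
    solution: List[Implicant] = []                 # S
    while work:
        alpha = pick_minterm(work)
        coeff, cells = pick_implicant(work, alpha)
        solution.append((coeff, cells))
        for cell in cells:                         # subtract the implicant from f
            remaining = work.get(cell, 0) - coeff
            if remaining > 0:
                work[cell] = remaining
            else:
                work.pop(cell, None)
    return solution


# Toy rules (hypothetical): smallest coordinate first, implicant = that minterm alone.
def first_minterm(work: Dict[Coord, int]) -> Coord:
    return min(work)


def single_cell(work: Dict[Coord, int], alpha: Coord) -> Implicant:
    return (work[alpha], [alpha])


cover = direct_cover({(0, 0): 2, (1, 0): 3, (1, 1): 0}, first_minterm, single_cell)
```

The four heuristics summarized in TABLE 1.1 would correspond to different choices for the two callables; the toy rules here simply return one implicant per minterm.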
TABLE 1.1: SUMMARY OF FOUR HEURISTIC ALGORITHMS

Heuristic Algorithm                    Choice of Minterm                 Choice of Implicant
-------------------------------------  --------------------------------  ---------------------------------------
Pomper and Armstrong [Ref. 5] (1981)   Random                            Drives most minterms to 0 or don't-care
Besslich [Ref. 6] (1986)               Smallest weight (most isolated)   Drives most minterms to 0 or don't-care
Dueck and Miller [Ref. 7] (1988)       Largest IF (most isolated)        Largest BCR
Yang and Wang [Ref. 10] (1989)         Smallest CF (most isolated)       Smallest NRC


TABLE 1.1 shows four previously proposed algorithms. They differ from each
other in the manner of picking the minterms ($\alpha$) and finding the implicants
($I_\alpha$). The Neighborhood Decoupling Algorithm developed by Yang and Wang is a
modified version of Dueck and Miller's. All of these algorithms initiate a search
procedure for $\alpha$ and evaluate the input function expression $f$ at minterm
$\alpha$. Next, an implicant $I_\alpha$ is chosen which covers $\alpha$. Then, implicant
$I_\alpha$ is added to the output solution set $S$, and $I_\alpha$ is subtracted from
function $f$.

The Pomper and Armstrong heuristic picks $\alpha$ randomly (as long as $\alpha$ is in
the set of minterms $M$) and finds an $I_\alpha$ (as long as $I_\alpha$ covers $\alpha$)
which drives the most minterms to 0 or don't-care when $I_\alpha$ is subtracted from
function $f$ [Ref. 5]. In 1986, Besslich presented an algorithm using weight
transformations. The Besslich algorithm picks the $\alpha$ with the smallest weight (the
most isolated minterm) and finds the $I_\alpha$ which has the lowest cost per minterm
covered (i.e., which drives the most minterms to 0 or don't-care) [Ref. 6]. In 1988,
Dueck and Miller presented another algorithm that picks $\alpha$ from $M$ if $\alpha$ has
the highest isolated factor (IF) and then finds the $I_\alpha$ which directly covers
$\alpha$ such that the break count reduction (BCR) is maximum [Ref. 7]. The ND algorithm
by Yang and Wang is an improvement to the Dueck and Miller algorithm with revised
decision rules for making selections of minterms and implicants. The ND algorithm is
characterized by adopting the advantages of each algorithm and fully utilizing the
properties of the truncated sum. The Parallel Neighborhood Decoupling (PND) algorithm
is the parallel version of the ND algorithm.

C.  THESIS  OUTLINE 

A summary of MVL definitions for truncated sum minimization is introduced
in Chapter II. The notations and definitions of Chapter II also help us in explaining
the algorithms in subsequent chapters. The computer system, iPSC/2, that is used
for developing the Parallel Neighborhood Decoupling algorithm is presented in
Chapter III. Chapters IV and V discuss the computation times of the sequential and
parallel versions of the ND algorithm.
II. NOTATIONS AND DEFINITIONS


The definitions for truncated sum MVL minimization are given by Yang and
Wang [Ref. 10, 11], and we use them here.

A.  DEFINITIONS  FOR  TRUNCATED  SUM 
Definition  1: 

Let $X = \{x_1, x_2, \dots, x_n\}$ be a set of $n$ input variables, where $x_i$ takes on
values from $R = \{0, 1, \dots, r-1\}$. An $n$-variable $r$-valued function $f$ is a mapping

$f : R^n \rightarrow R \cup \{r\}$. [Ref. 9]

Here, $r$ is a don't-care value; it can be chosen freely from any of the logic
values $0, 1, \dots, r-1$.

Definition  2:  MIN 

The MIN function [Ref. 9] is denoted as $f(x_1, x_2) = x_1 x_2$, which evaluates to the
minimum value of its arguments. For example, if $R = \{0,1,2,3\}$, then $f(1,2) = 1$ and
$f(0,3) = 0$. A minterm is an assignment of values to $x_1, x_2, \dots, x_n$ such that
$f(x) \neq 0$.

Definition 3: Literal

The literal operation of a variable $x$ is defined as:

${}^{a}x^{b} = \begin{cases} r-1 & a \le x \le b \\ 0 & \text{otherwise.} \end{cases}$   (2.1)
Definition 4: Truncated Sum (TSUM)


The truncated sum (TSUM) operation is defined as:

$\mathrm{TSUM}(x_1, x_2) = x_1 + x_2 = \min(x_1 + x_2,\; r - 1)$.   (2.2)

The  two  +  signs  in  this  expression  are  different.  The  leftmost  denotes  the 
TSUM  operation,  while  the  rightmost  denotes  ordinary  addition  of  two  logic  values 
which  are  viewed  as  integers.  For  example,  if  R  =  {0,1,2,3},  then  TSUM(1,2)  =  3 
and  TSUM(2,2)  =  3.  The  TSUM  obeys  the  associative  and  commutative  rules. 

These  definitions  are  inspired  by  the  fact  that  CCD  implementation  supports 
TSUM  naturally  [Ref.  9]. 

Example 1:

For example, ${}^{1}x_1^{3}$ is a literal and takes the value 3 when $1 \le x_1 \le 3$.
However, the function $2\,{}^{1}x_1^{3}$ takes the value 2, based on the definition of MIN.
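The three basic operations can be checked numerically with a short Python sketch. The function names `mvl_min`, `literal`, and `tsum` are chosen here for illustration only; the values reproduce the examples given in the text for $R = \{0,1,2,3\}$.

```python
def mvl_min(*values: int) -> int:
    """MIN: the minimum of its arguments (Definition 2)."""
    return min(values)


def literal(x: int, a: int, b: int, r: int) -> int:
    """Literal: r-1 when a <= x <= b, and 0 otherwise (Equation 2.1)."""
    return r - 1 if a <= x <= b else 0


def tsum(x1: int, x2: int, r: int) -> int:
    """Truncated sum: ordinary addition clipped at r-1 (Equation 2.2)."""
    return min(x1 + x2, r - 1)


R_SIZE = 4  # r = 4, i.e. R = {0, 1, 2, 3}

min_examples = (mvl_min(1, 2), mvl_min(0, 3))
tsum_examples = (tsum(1, 2, R_SIZE), tsum(2, 2, R_SIZE))
# Example 1: the literal with window [1, 3] evaluated at x1 = 2, then scaled by 2.
literal_value = literal(2, 1, 3, R_SIZE)
scaled_value = mvl_min(2, literal_value)
```

Associativity and commutativity of TSUM follow because clipping at $r-1$ commutes with ordinary addition of non-negative integers.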

Definition 5: Product Term

A product term $p$ is the MIN of one nonzero constant $c \in R$ and one or more
literal functions. In general, a product term is defined as:

$p = c\;{}^{i_1}x_1^{j_1}\;{}^{i_2}x_2^{j_2} \cdots {}^{i_n}x_n^{j_n}, \quad i_k \le j_k;\; i_k, j_k \in R;\; 1 \le k \le n.$   (2.3)

The constant or coefficient $c$ in a product term effectively scales the term. For
each variable $x_k$, we say the window size of the literal ${}^{i_k}x_k^{j_k}$ is
$j_k - i_k + 1$. We use the terms product term and implicant interchangeably in this thesis.
Definition 6: Minterm


A minterm $\alpha$ is a product term in which all literals have a window size of 1.
For example, a product term with coefficient 2 whose literals all have a window size
of 1 is also a minterm. We say the coordinate of $\alpha$ is
$\langle a_1, a_2, \dots, a_n \rangle$. We denote the value of minterm $\alpha$,
$g(\alpha)$, as the nonzero constant $c$.

A product term $p = c\;{}^{i_1}x_1^{j_1}\;{}^{i_2}x_2^{j_2} \cdots {}^{i_n}x_n^{j_n}$ can be
decomposed into $\prod_{k=1}^{n} (j_k - i_k + 1)$ minterms. We say $p$ generated those
minterms. Given a product term $p$, the set of minterms generated from $p$ is denoted by
$MS_p$. If the number of elements in $MS_{p_1}$ is greater than that in $MS_{p_2}$, we say
$p_1$ covers a larger area than $p_2$. Given a function $f$, the set of minterms generated
from its product terms is denoted by $MS_f$.
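The decomposition of a product term into minterms can be illustrated with a small Python sketch. The representation (a coefficient plus a list of $(i_k, j_k)$ window bounds) and the name `generated_minterms` are assumptions made for this example.

```python
from itertools import product
from math import prod
from typing import Dict, List, Tuple


def generated_minterms(c: int, bounds: List[Tuple[int, int]]) -> Dict[Tuple[int, ...], int]:
    """Return MS_p as {coordinate: value} for p = c * literals with windows [i_k, j_k]."""
    axes = [range(i, j + 1) for i, j in bounds]
    return {coord: c for coord in product(*axes)}


# A product term with coefficient 2 and windows [0, 1] on x1 and [2, 3] on x2.
bounds = [(0, 1), (2, 3)]
ms_p = generated_minterms(2, bounds)
expected_size = prod(j - i + 1 for i, j in bounds)  # product of the window sizes

# A minterm is the special case in which every window has size 1.
ms_alpha = generated_minterms(2, [(1, 1), (2, 2)])
```

Comparing `len(ms_p)` for two product terms corresponds to comparing the areas they cover.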

Definition 7: Sum-of-Products Expression

A sum-of-products expression is $p_1 + p_2 + \dots + p_N$ for some integer $N$, where
each $p_i$ is a product term. For example, the TSUM of three product terms with
coefficients 3, 2, and 3 is a sum-of-products expression.

Definition  8:  Saturated  Minterms  (SAT) 

Given a minterm $\alpha$ generated from the original function to be minimized, if
$g(\alpha) = r - 1$, then $\alpha$ is a saturated minterm. Let SAT be the set of all
saturated minterms of a function.


Example  2: 


If  the  input  function  to  be  minimized  is  expressed  as  follows, 


^  —  3  1  V?  Ovr^  lv_^+0  2^2^^  0^2  0 


X{  ^X2+2  “Xi“  “^2+3  "Xi"  1X2^ +2  ^Xi  "X2‘‘+l  “Xi"  “xi’+l  "Xi^  "X2" 


the $MS_f$ can be represented as 15 minterms in Figure 2.1. We mark a saturated
minterm with a dot in the figure.


Figure 2.1: Map for Examples 2, 3, and 4; Step 1 of Table 3.2


Lemma 1: Given a minterm $\alpha$, the maximum number of implicants which cover
$\alpha$ is $O(r^{2n})$.

Proof: Consider a variable (axis) $x_i$ of $\alpha$. Any implicant ($I_\alpha$) that
covers $\alpha$ may have a range or "window size" $w$ such that $1 \le w \le r$. With a
window size $w$, we may have up to $w$ implicants that cover $\alpha$. That is, for a given
position $a$ within a window, there are $(a+1)$ ways to choose a lower bound on the window
$(0, 1, \dots, a)$ and $(r-1) - a + 1 = r - a$ ways to choose the upper bound, for a total
of $(a+1)(r-a)$ ways, which achieves a maximum of about $r^2/4$ when $a \approx r/2$.
Taking the product over the $n$ variables gives $O(r^{2n})$.
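The counting argument in the proof can be verified by brute force. The following Python sketch enumerates every window $[lo, hi]$ on one axis that contains position $a$ and compares the count with $(a+1)(r-a)$; the function names are illustrative, and the coefficient of the implicant is treated as fixed.

```python
from math import prod
from typing import List, Tuple


def covering_windows(a: int, r: int) -> List[Tuple[int, int]]:
    """All windows [lo, hi] with 0 <= lo <= a <= hi <= r-1 on a single axis."""
    return [(lo, hi) for lo in range(a + 1) for hi in range(a, r)]


def implicant_shapes(coord: Tuple[int, ...], r: int) -> int:
    """Number of axis-aligned windows covering a coordinate: one factor per variable."""
    return prod(len(covering_windows(a, r)) for a in coord)


# Per-axis counts for r = 4 and a = 0, 1, 2, 3.
counts = [len(covering_windows(a, 4)) for a in range(4)]
# Two variables, minterm at coordinate <1, 2>.
shapes = implicant_shapes((1, 2), 4)
```

For $r = 4$ the per-axis maximum is 6, slightly above $r^2/4 = 4$, which is why the proof says "about": the exact maximum of $(a+1)(r-a)$ is $((r+1)/2)^2$ rounded down.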
B. THE PROPERTIES OF TRUNCATED SUM


There  are  two  important  properties  of  the  truncated  sum  which  are  useful  later 
in  developing  the  ND  algorithm. 

1. Saturated minterms can be generated by the TSUM operation.

The truncated sum of two or more minterms may produce a
saturated minterm. By Definition 4, the truncated sum of any saturated
minterm and a minterm identical except for the coefficient is a saturated
minterm. In other words, given two such minterms $\alpha$, $\beta$ with
$g(\beta) = r-1$, then $\mathrm{TSUM}(\alpha, \beta) = r-1$. If the value of $\gamma$ is
$r-1$, i.e., $\gamma$ is a saturated minterm, then for any other such minterm $\delta$,
$\gamma + \delta = \gamma$.

As an example, in a 2-variable 4-valued function, three minterms add
in one position $\langle a_1, a_2 \rangle$:

$2\,{}^{a_1}x_1^{a_1}\,{}^{a_2}x_2^{a_2} + 1\,{}^{a_1}x_1^{a_1}\,{}^{a_2}x_2^{a_2} + 1\,{}^{a_1}x_1^{a_1}\,{}^{a_2}x_2^{a_2} = 3\,{}^{a_1}x_1^{a_1}\,{}^{a_2}x_2^{a_2} + 1\,{}^{a_1}x_1^{a_1}\,{}^{a_2}x_2^{a_2} = 3\,{}^{a_1}x_1^{a_1}\,{}^{a_2}x_2^{a_2}$

The first two terms form a saturated minterm, and this saturated
minterm absorbs the third minterm.

2. Don't-care minterms can be produced by saturated minterms.

In the minimization procedure, we may update a minterm $\alpha$ to $\alpha'$ by
subtracting minterm $\gamma$ ($\alpha' = \alpha - \gamma$), where $\gamma$ is the value of
the selected implicant. If $\alpha \in$ SAT, in a succession of updates, the value of
$\alpha'$ may reach the value 0. In that case, the algorithm will reset that minterm
coordinate to don't care, i.e., value $r$. In this way, additional values can
be subtracted, perhaps producing a set of fewer implicants than in the case
where we require product terms to sum exactly to the maximum value
(rather than to a value equal to or greater than it).

C. DEFINITIONS USED IN ND ALGORITHM
Definition 9: Direct Neighbors

Let $\alpha$ and $\beta$ be minterms with coordinates $\langle a_1, a_2, \dots, a_n \rangle$
and $\langle b_1, b_2, \dots, b_n \rangle$, respectively. If for all $i$ we have $a_i = b_i$
except for one position $j$ such that $|a_j - b_j| = 1$, we say that $\alpha$ and $\beta$
are direct neighbors. Given a minterm $\alpha$, we use $N(\alpha)$ to denote the set of its
direct neighbors.

Observation 1: The maximum number of direct neighbors of a given minterm is $2n$.
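Definition 9 and Observation 1 can be illustrated with a short Python sketch that lists the coordinates adjacent to a given coordinate. It works on coordinates only; restricting the result to positions that actually hold minterms of $f$ is left to the caller. The name `direct_neighbor_coords` is an assumption of this example.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def direct_neighbor_coords(coord: Coord, r: int) -> List[Coord]:
    """Coordinates differing from coord by exactly 1 in exactly one position."""
    neighbors: List[Coord] = []
    for j, a in enumerate(coord):
        for b in (a - 1, a + 1):      # negative side, then positive side
            if 0 <= b < r:            # stay inside the value range 0..r-1
                neighbors.append(coord[:j] + (b,) + coord[j + 1:])
    return neighbors


interior = direct_neighbor_coords((1, 2), 4)  # n = 2: reaches the bound 2n = 4
corner = direct_neighbor_coords((0, 0), 4)    # a corner has only n neighbors
```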
Definition 10: Directional Neighbors

Two minterms $\alpha$ and $\beta$ are directional neighbors in the direction $x_j$ if
$a_i = b_i$ for all $i \in [1,n]$ such that $i \neq j$, and $a_j \neq b_j$. When
$b_j > a_j$ we say that $\beta$ is in the positive direction of $\alpha$, while when
$b_j < a_j$ we say that $\beta$ is in the negative direction of $\alpha$.
Observation 2: If $\beta$ is a direct neighbor of $\alpha$, then $\beta$ is a directional
neighbor of $\alpha$ in the direction of $x_i$ for some $i \in [1,n]$.

Definition 11: Connected Minterms

This is a recursive definition. Given a minterm $\alpha$ and a minterm $\beta$, we say
$\beta$ is a connected minterm of $\alpha$ if

1. $\beta$ is a direct neighbor of $\alpha$ and either $g(\beta) < g(\alpha)$ or
$\alpha \in$ SAT.

2. $\beta$ is a directional neighbor of $\alpha$ in direction $x_i$, $\beta$'s direct
neighbor is connected to $\alpha$, and either $g(\beta) < g(\alpha)$ or $\alpha \in$ SAT.

For example, in Figure 2.2 the minterms pointed to by arrows are connected minterms
of the minterm marked with the @ sign.


Definition 12: Connected Minterm Count

$CMC_\alpha$ is the connected minterm count of minterm $\alpha$. It is the number of
minterms that are connected to minterm $\alpha$.

Definition 13: Expandable Directional Count

$EDC_\alpha$ is the expandable directional count of minterm $\alpha$. It is the number of
directions (both positive and negative for each $x_i$) in which $\alpha$ has one or more
connected minterms.

Observation 3: $0 \le EDC_\alpha \le 2n$.
Definition 14: Clustering Factor

The clustering factor relative to a minterm $\alpha$ is defined as

$CF_\alpha = (r-1) \cdot EDC_\alpha + CMC_\alpha$.   (2.4)

This is a measure of the weight of all connected minterms relative to $\alpha$. The
$(r-1)$ factor is the range, or maximum possible number of minterms, in a direction
$x_i$.
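Equation 2.4 and the "smallest CF" selection rule from TABLE 1.1 can be sketched as follows. The function names are illustrative, and the tie-breaking rule (the first minterm encountered wins) is an assumption, since the text here does not specify one.

```python
from typing import Dict, Hashable


def clustering_factor(r: int, edc: int, cmc: int) -> int:
    """CF = (r - 1) * EDC + CMC (Equation 2.4)."""
    if edc < 0 or cmc < 0:
        raise ValueError("EDC and CMC are counts and cannot be negative")
    return (r - 1) * edc + cmc


def most_isolated(cf_by_minterm: Dict[Hashable, int]) -> Hashable:
    """Pick the minterm with the smallest CF; ties go to the first one (assumed)."""
    return min(cf_by_minterm, key=cf_by_minterm.get)


# A 4-valued minterm with one connected minterm in one direction (EDC = 1, CMC = 1).
cf_one = clustering_factor(4, edc=1, cmc=1)
# A completely isolated minterm (EDC = 0, CMC = 0).
cf_isolated = clustering_factor(4, edc=0, cmc=0)
# Hypothetical labels "m1".."m3" with CF values of the kind that appear in TABLE 2.1.
chosen = most_isolated({"m1": 10, "m2": 0, "m3": 13})
```

Because $EDC_\alpha$ is multiplied by $r-1$, a minterm that can expand in many directions is ranked as less isolated than one with the same number of connected minterms concentrated in a single direction.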
Figure 2.3: Map for Example 3, Step 2 of Table 3.2


Example 3:

In Figure 2.1, the minterm marked with the @ sign is one of 15 minterms and has
only one connected minterm and so only one expandable directional neighbor, i.e.,
its CMC and EDC values are 1 and 1, respectively. Figure 2.3 shows the map after the
circled implicant (with coefficient 1) was subtracted from Figure 2.1. We mark a
minterm with a dot in the figure because it was a saturated minterm in the original
function map (see Definition 8 and Figure 2.1). The minterm with coefficient 2 marked
with the @ sign has no connected minterms and no expandable directional neighbors, so
$CMC_\alpha = 0$ and $EDC_\alpha = 0$. The clustering factors of all minterms in
Figure 2.3 are listed in TABLE 2.1.


TABLE 2.1: CFs FOR ALL MINTERMS IN FIGURE 2.3


Minterm 

2  °X2 

2  '^xl  "^xl 

2  ^xl  ^xl 

3  ^xl  ^xl 

CF 

0 

10 

13 

10 

Minterm 

2  ^xl  2^1 

3  ^xl  2x| 

2  ^xl  ^xl 

2  ^xl  ^xl 

CF 

16 

13 

10 

13 

Minterm 

2  ^xl 

3  ^xl 

CF 

13 

10 

13 


III. iPSC/2 CONCURRENT SUPERCOMPUTER


A.  SYSTEM  DESCRIPTION 

In an iPSC/2 system, a large number of processors or nodes work concurrently
on parts of a single problem. An iPSC/2 system consists of compute nodes and a
front end processor, called the host. A node is an 80386 processor/memory pair. Its
physical memory is distinct from that of the host and other nodes, i.e., it is a
distributed memory system. Each node runs the NX/2 operating system and can access both
the host file system and the iPSC/2 Concurrent File System. The host system runs the UNIX
System V operating system.

A  typical  iPSC/2  application  has  a  host  program  that  runs  on  the  host  and  a 
node  program  that  runs  on  a  group  of  allocated  nodes  called  a  cube.  The  host 
program  executes  in  the  UNIX  environment  as  a  process.  It  initializes  the 
application,  provides  any  necessary  human  interface,  and  loads  the  node  program 
onto  the  nodes.  Generally,  a  node  program  performs  calculations,  exchanges 
messages with other nodes, and sends results back to the host.

B.  SYSTEM  CHARACTERISTICS 

An  iPSC/2  system  consists  of  the  following  units: 

•  IBM 386 AT Host Server

•  1.5 Gigabytes (OACIS) / 100 Megabytes (Math Dept.) hard disk space

•  32 nodes (OACIS) / 8 nodes (Math Dept.), each with

-  80386 Processor

-  Weitek 1167 (OACIS) / 80387 (Math Dept.) Math Coprocessor

-  8 MBytes (OACIS) / 4 MBytes (Math Dept.) of Memory

Before  loading  the  programs  to  the  nodes,  a  cube  must  be  allocated.  The  cube 
may  consist  of  all  the  nodes  in  an  iPSC/2  system  or  a  subset  of  the  nodes,  but  the 
number of nodes is always a power of two; that is, a k-cube consists of $2^k$ nodes.

C.  PARALLEL  PROGRAMMING 

The degree of parallelism differs from program to program. A perfectly
parallel program is one that requires no internode communication. In a perfectly
parallel program, if we double the number of nodes, we halve the computation time.
But most programs involve a mix of computation and internode communication.
One of the goals of parallel algorithm design is to develop a communication strategy
that maximizes the time a node spends computing and minimizes the time it spends
communicating or waiting for another node to complete a computation.

Communication  among  processes  in  an  iPSC/2  system  is  done  with  message 
passing.  Nodes  do  not  share  physical  memory.  Messages  are  characterized  by  a 
length,  a  type  and  an  ID: 

•  The  message  length  is  the  length  of  the  structure  in  bytes.  The  message 
sending  routines  will  send  exactly  the  specified  message  length. 

•  The  message  type  defines  the  message  which  a  particular  node  is  waiting  for. 
There are two types of messages that can be sent: synchronous and asynchronous.

Another way of communicating between the nodes is by global operations. The
global  operations  are  high  level  constructs  for  communication  among  the  node 
processes  [See  Section  D].  In  global  operations,  the  results  are  shared  between  the 
nodes,  so  instead  of  sending  messages  from  nodes  to  the  host  and  then  calculating 
the  results,  only  the  result  of  the  global  operation  is  sent  to  the  host  by  one  of  the 
nodes.  This  may  reduce  the  message  traffic  over  the  system. 

D.  SUMMARY  OF  iPSC/2  SYSTEM  CALLS 

The  system  calls  that  are  used  in  the  ND  parallel  algorithm  and  Multi-branch 
Concurrent algorithm are as follows:

•  Node  identification  :  setpid(),  myhost(),  mynode(),  numnodes() 

•  Clock  :  mclock() 

•  Program  loading  :  load() 

•  Message  Passing  :  csend(),  crecv(),  gisum() 

•  Concurrent  File  System  :  open(),  cwrite() 

The system call setpid(HOST_PID) is used to assign the process id of the host
program. This id is needed for message passing between the host and the nodes. In
our program, HOST_PID is defined in "pardef.h" [See Appendix A]. For message
passing purposes, the host is considered to have a node number, which is always one
more than the highest numbered node in the cube (or equal to the number of nodes
in the cube). For example, the host's node number in an 8-node cube is 8, while 0
through 7 are used to number the nodes in the cube. The call myhost() returns the host's
node number. The system call mynode() returns the number of the node on which
the program is executing. This call is useful for making decisions based on the node
number of a process [See Chapter V, Section B]. The system call numnodes() returns
the number of nodes in the allocated cube. This call is especially useful for making
programs general purpose: by using numnodes(), the user does not have to enter
the cube size into the program.

The mclock() routine provides a simple mechanism for measuring time
intervals. The system call mclock() returns the value of a counter that reflects
relative time in milliseconds. We obtain an initial time value and interpret the stop
time relative to this initial value. We use mclock() only in the MCND algorithm.

The system call load(filename, node, id) is used for loading the processes
(filename) onto the nodes. As soon as a node is loaded, it starts the execution of the
program. The variable node is an integer which defines the node number on which
the process will be loaded. When node is set to -1, the load() instruction
broadcasts to all nodes. The variable id is the process id of the program that will be
loaded. Each node can be loaded with up to 20 processes, but in our programs we
only used one process per node, so the only process id is 0.

The system calls csend(type, buf, len, node, pid) and crecv(type, buf, len) are
synchronous message passing instructions. The iPSC/2 provides asynchronous
message passing also, but because the nodes start execution right after they are
loaded, we need to block the processes until the message that contains the Working
Expression Set and the coordinates of the minterm is received. With synchronous
message passing, the node resumes execution only after the message is received.
Asynchronous message passing could be used, but then another instruction, msgwait(),
is needed to block the process while it waits for the message. The variable type assigns
the message id which that instruction is sending or waiting for. The variables buf and len
define the address and size of the message buffer. The variable node has the same
effect as in load(), i.e., it defines the node to which the message will be sent. If it is -1,
the message is broadcast to all the nodes. Lastly, pid specifies the id of the process which
is to receive the message.

The system call gisum(x, n, work) is one of the global
operations. These operations accumulate data from the entire allocated cube. The
argument x is a pointer to the input vector to be used in the operation; after the
completion of the operation it contains the final result. The variable n is the length of
the vector, and work is a working array for the summation. All the nodes must call the same
routine (with their own x) for a specific operation; in our case, it is an integer
summation, and the final result is distributed to all nodes. The system call gisum()
calculates the sum of each integer component of x across all nodes. The result is
returned in x to every node.
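The effect of the global sum described above can be mimicked in a few lines of Python. This is a single-process simulation of the semantics only (component-wise integer sum, result delivered to every participant); `simulate_gisum` is a hypothetical helper and not the iPSC/2 call itself, and no message passing takes place.

```python
from typing import List


def simulate_gisum(node_vectors: List[List[int]]) -> List[List[int]]:
    """Each node contributes a vector x of equal length; every node gets the sums."""
    length = len(node_vectors[0])
    if any(len(vec) != length for vec in node_vectors):
        raise ValueError("all nodes must pass vectors of the same length n")
    totals = [sum(vec[k] for vec in node_vectors) for k in range(length)]
    # Every node ends up holding its own copy of the result in x.
    return [list(totals) for _ in node_vectors]


# Four nodes, each holding a two-component vector, e.g. [EDC part, CMC part].
before = [[0, 0], [1, 2], [1, 1], [0, 0]]
after = simulate_gisum(before)
```

In the parallel algorithm only one node then needs to forward the totals to the host, which is the traffic reduction mentioned in the discussion of global operations.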

The system call open(filename###, O_CREAT|O_RDWR|O_APPEND, 0644)
opens a file and returns a file number that can be used later. The three "#" symbols
after the file name are replaced by the number of the node which opens the file.
cwrite(file_no, buf, strlen(buf)) writes the data which is in the buffer to the file with
the assigned file_no. To send formatted streams to the buffer, we used the sprintf()
instruction. This buffer is then written to the file.
IV. PARALLEL NEIGHBORHOOD DECOUPLING ALGORITHM


The parallel neighborhood decoupling algorithm is a parallel version of the
ND algorithm [Ref. 10]. The Parallel ND (PND) algorithm has two computational
phases: minterm selection and implicant selection. Minterm selection is based on the
clustering factor computation [See Chapter II, Section C]. Implicant selection is
based on the Neighborhood Relative Count (NRC) computation. From all implicants
which cover the selected minterm, the implicant that is the most loosely coupled
(isolated) with its neighbors is chosen. This decoupling process is based on the fact
that if we choose the most isolated implicant, then we may minimize the negative
impact on future minterm selections as well as implicant selections.

In the ND algorithm, before selecting another most isolated minterm, the
implicant that is selected should be subtracted from the expression. The update of
the expression must be completed before the minterm selection of the next
computation phase. We searched for a part of the algorithm in which we could minimize
the communication and maximize the time spent on computation, and found that the
CF computation was a good candidate for parallelization. The other parts of the
algorithm, such as the Neighborhood Relative Count, are not so amenable to
parallelization, because they need much communication time compared to the
computation which would be performed by a node. For example, in the NRC
computation [See Section B], much time is spent executing conditional branch
instructions. Even though the NRC algorithm is large as static code, the amount of code
executed dynamically is small, so the communication time spent sending the data to the
node where the NRC procedure executes would outweigh the computation, and parallelizing
it is not feasible. The main idea in parallelizing the CF computation is to perform the
EDC and CMC [Definitions 12 & 13] computations for each direction of each variable of a
minterm on a separate node. The number of nodes needed depends on the number of
variables. For each variable, we need two nodes, one for the negative side of a minterm
at the corresponding coordinate and the other for the positive side. The EDCs and CMCs
that are calculated are summed using a global sum operation, and node #0 sends
the total EDC and CMC values to the host. The host then asks for another
minterm's CF value.
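The division of work described above can be sketched as a sequential Python simulation. The mapping of node numbers to (variable, direction) pairs, with even nodes taking the negative side and odd nodes the positive side, is an assumption made for this illustration, as are the names `node_task` and `parallel_cf`; the per-direction counting itself is supplied by the caller as a stub.

```python
from typing import Callable, Dict, Tuple


def node_task(node_id: int) -> Tuple[int, int]:
    """Assumed mapping: node 2k handles variable k, negative side; node 2k+1, positive."""
    return node_id // 2, (-1 if node_id % 2 == 0 else +1)


def parallel_cf(
    num_vars: int,
    r: int,
    count_direction: Callable[[int, int], Tuple[int, int]],
) -> int:
    """Each of the 2n nodes yields a partial (EDC, CMC); a global sum combines them."""
    partials = [count_direction(*node_task(node)) for node in range(2 * num_vars)]
    edc_total = sum(edc for edc, _ in partials)   # stands in for the global sum
    cmc_total = sum(cmc for _, cmc in partials)
    return (r - 1) * edc_total + cmc_total        # Equation 2.4


# Hypothetical per-direction results for one minterm of a 2-variable, 4-valued function.
fake_counts: Dict[Tuple[int, int], Tuple[int, int]] = {
    (0, -1): (0, 0), (0, +1): (1, 2),
    (1, -1): (1, 1), (1, +1): (0, 0),
}
cf = parallel_cf(2, 4, lambda var, direction: fake_counts[(var, direction)])
```

On the real machine the list comprehension would be replaced by the nodes running concurrently, so the elapsed time is governed by the slowest direction plus the cost of the global sum rather than by the sum over all directions.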

In  the  sequential  version  of  the  Yang  and  Wang  algorithm,  the  main  program 
asks  for  the  coordinates  of  a  minterm  which  has  the  smallest  clustering  factor.  The 
sequential  clustering  factor  procedure  computes  the  EDC  and  CMC  values  for  the 
negative  direction  of  the  first  coordinate  and  then  computes  those  values  for  the 
positive  direction  of  the  same  coordinate.  Then,  the  EDC  and  CMC  values  of  the 
second  coordinate  are  computed  for  the  negative  and  positive  directions.  This 
procedure is applied to all consecutive coordinates, i.e., variables. The results are
summed  up  and  CF  is  calculated.  When  the  number  of  the  variables  is  increased, 
we  have  more  coordinates  to  compute.  This  computation  scheme  is  depicted  in 
Figure  4.1. 
Figure 4.1: Flowchart of Sequential ND Algorithm

The parallel version of the ND algorithm has a different approach to the
clustering factor computation. We still need the EDC and CMC values for the
negative and the positive directions of the coordinates of the selected minterm. The
parallel version loads the code needed to calculate the negative and positive
directions of a coordinate onto the nodes. For a 3-input-variable expression, 6 nodes
are required. The allocation of the nodes is shown in Figure 4.2. The dummy
nodes in Figure 4.2 are explained at the end of Section A.
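Since a cube always holds a power-of-two number of nodes while the computation needs two nodes per variable, the cube size and the number of left-over nodes can be worked out as below. This is a sketch under the assumption that the smallest sufficient cube is allocated and that the dummy nodes are exactly the allocated nodes beyond the $2n$ that do useful work; `cube_for` is a hypothetical helper name.

```python
from typing import Tuple


def cube_for(num_vars: int) -> Tuple[int, int, int]:
    """Return (cube dimension k, allocated nodes 2**k, dummy nodes) for n variables."""
    needed = 2 * num_vars           # one node per direction of each variable
    k = 0
    while (1 << k) < needed:        # smallest power of two that is large enough
        k += 1
    allocated = 1 << k
    return k, allocated, allocated - needed


three_vars = cube_for(3)   # the 3-variable case discussed in the text
four_vars = cube_for(4)
five_vars = cube_for(5)
```

For three input variables this gives a 3-cube of 8 nodes, of which 6 compute and 2 are idle; with four variables the 8-node cube is used completely.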

The main benefit of the parallel algorithm comes when we increase the
number of variables. The time needed in the sequential computation is
proportional to the number of variables. Figure 4.1 shows that when we have more
variables, the algorithm grows vertically and requires more computation time.
Figure 4.2 shows that when we have more input variables, the algorithm can expand
horizontally (until we run out of nodes). Thus, the parallel algorithm will not spend
as much time as the sequential algorithm to compute a clustering factor.
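The contrast between the two schemes can be sketched in Python as follows. This is a minimal illustration, not the thesis code: the callback `count` is a hypothetical stand-in for the routine that returns the two counts for one coordinate and one direction (called dea and ea in the pseudocode message fields), and `parallel_steps` ignores communication time.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical callback type: (coords, variable index, direction) -> (dea, ea).
CountFn = Callable[[Sequence[int], int, int], Tuple[int, int]]


def sequential_cf(coords: Sequence[int], radix: int, count: CountFn):
    """Visit every coordinate, negative direction first, then positive."""
    dea_total = 0
    ea_total = 0
    steps = 0
    for var in range(len(coords)):
        for direction in (-1, +1):
            dea, ea = count(coords, var, direction)
            dea_total += dea
            ea_total += ea
            steps += 1
    # CF formula as used by the host program: dea * (radix - 1) + ea.
    return dea_total * (radix - 1) + ea_total, steps


def parallel_steps(num_variables: int, num_nodes: int) -> int:
    """Counting rounds when each node takes one (coordinate, direction) pair."""
    tasks = 2 * num_variables
    return -(-tasks // num_nodes)  # ceiling division
```

With one node per (coordinate, direction) pair, the number of counting rounds per minterm stays at one as variables are added, whereas the sequential loop performs two counting steps per variable.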


Figure  4.2:  Flowchart  of  Parallel  ND  Algorithm 

The ND Algorithm is listed below. In this algorithm, f denotes the function
to be minimized.

/* ***************************************************************** */

MS_f : Original Expression Set
WS   : Working Expression Set
SS   : Solution Set
{

    SS ← ∅;   /* SS = Solution Set */

    WS = MS_f = { α | α is generated by the function f; if α ∈ SAT then mark its
    coordinate }.

    While WS ≠ ∅ do {

        1. Use algorithm CF_PAR to select a minterm α from the WS.

        2. Use algorithm N to select an implicant I_α that covers α.

        3. SS ← SS ∪ I_α.

        4. ∀ β ∈ I_α do {

            compute g(β) ← g(β) - g(α).   /* I_α is subtracted from WS */

            if β is originally marked and g(β) = 0 then g(β) ← r.

            /* don't care terms */

        }

    }

}
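Step 4 of the listing can be sketched in Python as below. The data layout is an assumption made for illustration only: the working set is a dictionary from coordinate tuples to values, `marked` is the set of coordinates marked at initialization, and `value` is the amount subtracted for the selected implicant. The termination test follows the statement in Example 4 that the final working set contains only 0 or the don't care value.

```python
def subtract_implicant(ws, marked, cells, value, radix):
    """Subtract an implicant from the working set (hypothetical layout).

    ws     : dict mapping coordinate tuples to current values
    marked : set of coordinates that were marked at initialization
    cells  : coordinates covered by the selected implicant
    value  : amount subtracted from every covered cell
    radix  : r, also used here as the don't care value
    """
    for cell in cells:
        ws[cell] = ws[cell] - value
        if cell in marked and ws[cell] == 0:
            ws[cell] = radix  # covered marked cell becomes a don't care
    return ws


def is_finished(ws, radix):
    """True when only empty (0) or don't care (r) cells remain."""
    return all(v in (0, radix) for v in ws.values())
```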

The search space of the algorithm can be represented as a tree where each
node represents the current working expression set and each edge corresponds to
an implicant selection. The root of the search tree is the original expression set,
MS_f.

A. ALGORITHM CF_PAR: MINTERM SELECTION

The ND Parallel algorithm computes the clustering factor for all minterms in
a working expression set. The number of nodes actually needed is (2 *
number of variables in the expression), but the system allows nodes to be
allocated only in powers of 2. For example, even though we need only 10 nodes
for 5 input variables, we have to allocate 16 nodes.
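The allocation rule can be written out as a short Python sketch (the function names are illustrative and do not appear in the thesis programs):

```python
def nodes_needed(num_variables: int) -> int:
    """One node per direction (negative, positive) of each coordinate."""
    return 2 * num_variables


def nodes_allocated(num_variables: int) -> int:
    """Smallest power of 2 that is at least the number of nodes needed."""
    needed = nodes_needed(num_variables)
    allocated = 1
    while allocated < needed:
        allocated *= 2
    return allocated
```

For 5 variables this gives 10 needed and 16 allocated nodes; the 6 surplus nodes become dummy nodes.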

The host program ND_PAR() loads the first half of the allocated nodes with
the program which computes the negative direction of a coordinate (cf_left) and the
second half with the program for the positive direction (cf_right). For each Working
Expression Set, the most isolated minterm's coordinates are requested. The host
program loads the current working set into a message array. This array is defined
in "pardef.h" and consists of the expression and the coordinates of the selected
minterm. A minterm is selected from the working set and its coordinates are
assigned to the message array. The message is broadcast to the nodes by using
synchronous message passing. The host program then blocks on a receive instruction,
waiting for the results.

After the nodes are loaded, the node programs start execution. Nodes block
on a receive instruction and wait for the message from the host. After they receive
the message from the host, they determine their assigned coordinates using the system
call mynode(). For example, for a 4 input variable expression, 8 nodes are allocated.
Nodes 0 to 3 compute the negative direction of the coordinates (X1
through X4) while nodes 4 to 7 compute the positive direction of the same
coordinates. If the number of nodes needed is less than the number of allocated nodes, then
the extra nodes become dummy nodes. All nodes check their assigned coordinates,
and if the coordinate is larger than the number of variables in the expression, they
return 0 for both EDC and CMC values. All EDC and CMC values computed on
the nodes are summed by using the global summation gisum(). The result is available
to all the nodes. Node #0 has the special assignment of sending the result to the host.
The host calculates the CF using the EDC and CMC values that are reported by node #0. The
host then selects another minterm from the working expression set. The above
procedure is repeated until the CF values of all minterms in the working
expression set are computed.

The  computation  of  CF  is  as  follows: 

/* ***************************************************************** */

WS  : Working Expression Set

X_i : Coordinates of a minterm α

/* ***************************************************************** */

Host Program

/* get the coordinates of the minimum CF minterm */

Cur_CF ← MAX_INT   /* largest integer */
message_to_node ← WS
∀ α ∈ WS do {

    message_to_node ← X_i
    send (message_to_node to all nodes)
    recv (message_from_node from node 0)

    CF ← message_from_node.dea * (radix - 1) + message_from_node.ea
    if (Cur_CF > CF) {

        Cur_CF ← CF
        Savecoord ← X_i

    }

}

return the coordinates of the minimum CF minterm

Node Program (CF_left)

EDC ← 0
CMC ← 0

recv (message_to_node from host)

variable_number ← mynode()   /* assign node number as coordinate */
if (variable_number < message_to_node.nvar) {   /* a dummy node, whose number is not
                                                   less than the number of variables,
                                                   does not compute */
    Compute EDC and CMC to the left of the coordinate

}

globalsum (Add EDC and CMC values for all nodes)
if (mynode() = 0) {
    send (message_from_node to host)   /* total EDC and CMC values from all nodes */
}

Node Program (CF_right)

EDC ← 0
CMC ← 0

recv (message_to_node from host)

variable_number ← mynode() - numnodes()/2   /* corrects and assigns the coordinate */

if (variable_number < message_to_node.nvar) {

    Compute EDC and CMC to the right of the coordinate

}

globalsum (Add EDC and CMC values for all nodes)

B.  ALGORITHM  N:  NEIGHBORHOOD  RELATIVE  COUNT 

The purpose of Algorithm N is to choose the most "isolated" implicant (I_α) and
update the working set WS. It computes the neighborhood relative count (NRC) for
all implicants that cover the minterm α. The implicant with the smallest NRC is
chosen. In other words, NRC is a measure of the coupling strength of an implicant
with its neighbors. To select an implicant (which is equivalent to breaking the
coupling between that implicant and its neighbors), the candidate implicant should
have the smallest coupling strength with its neighbors. Therefore, the ND algorithm
tends to choose the most "isolated" implicant. If there is a tie in selecting the
implicant, the ND algorithm chooses the one which covers the largest area. The
computation of NRC for a given implicant is described as follows:

1. Initialize the NRC to zero.

2. Check all neighboring minterms of the implicant and increment or
decrement its NRC according to the following (intuitively stated) rule: if
the coupling strength between the covered and uncovered areas is weak (good for further
decoupling), Algorithm N decreases the NRC; otherwise it increases the NRC.

α: the chosen minterm from algorithm CF_PAR

M: the set of minterms covered (generated) by the chosen implicant (I_α).

N(β): the set of direct neighbors of minterm β.
/* ***************************************************************** */

{

    NRC ← 0;

    ∀ β ∈ M and β ≠ α do {

        if (g(β) - g(α) ≤ 0) then NRC ← NRC - 2;

    }

    ∀ β ∈ M and ∀ γ ∈ N(β) do {

        if (γ ∉ M and γ ≠ 0 and (γ ∉ SAT or β ∉ SAT)) then {

            if (g(β) - g(α) > g(γ)) then {

                if (γ ∈ SAT) then NRC ← NRC - 1;

                else NRC ← NRC + 2;

            }

            if (g(β) - g(α) < g(γ)) then {

                if (g(β) = g(γ)) then NRC ← NRC + 2;
                if (γ ∈ SAT and g(γ) - g(β) < 0) then
                    NRC ← NRC + 2;

                else {

                    if (g(β) > g(α) and g(β) ≠ g(γ)) then {
                        if (β ∈ SAT) then NRC ← NRC - 1;
                        else NRC ← NRC + 2;

                    } /* end if */

                } /* end else */

            } /* end if */

            if (g(β) - g(α) = g(γ)) then {
                if (g(γ) > 0 or β ∈ SAT) then
                    NRC ← NRC - 1;
                else NRC ← NRC - 2;

            }

        } /* end if */

    }

    if (M = {α}) then {

        if (α ∈ SAT) then NRC ← 2;

        else if (NRC < 0) then NRC ← 1;

    }

    else NRC ← NRC + 2;
}
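Once an NRC has been computed for every implicant covering α, the selection rule stated at the start of this section (smallest NRC, ties broken by the largest covered area) can be sketched in Python. The tuple layout `(name, nrc, area)` and the sample values are hypothetical and serve only to illustrate the rule.

```python
def select_implicant(candidates):
    """Pick the implicant with the smallest NRC.

    candidates: list of (name, nrc, area) tuples, where area is the number
    of minterms the implicant covers. On an NRC tie the implicant with the
    larger area wins.
    """
    return min(candidates, key=lambda c: (c[1], -c[2]))
```

Sorting on the pair (NRC, negated area) expresses both criteria in a single comparison.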


Figure 4.3: Third Step of minimization for the function in Example 4


Example  4: 


The  input  function  to  be  minimized  is  expressed  as: 
f  *  3  ^xl  ^xl*2  °x°  ^xl  ^xi+2  ^Xi  ^xi+1  °Xi  °xi+l  ^Xi  ^xi 


The working set, WS, is initialized to MS_f and is represented in Figure 2.1. The
clustering factors of all minterms in WS are calculated, and the first minterm with
the minimum CF is selected as α; in this case it is l ^Xi °xi . The ND algorithm computes the
NRC for each implicant I which covers α using Algorithm N. For the WS in Figure
2.1, implicant l "xa is selected. This implicant is added to the solution set,
SS, and subtracted from the working set, WS. The result can be seen in Figure 2.3. The
minterm and implicant 2 °x° °Xa° is selected (see Example 3). This implicant is
also added to the solution set and subtracted from the working set. Because this
implicant is a SAT, it is shown as don't care "4" in the working set. Figure 4.3 shows
the resulting WS. The clustering factor computations that are performed by the different
nodes are shown in TABLE 4.1. The minimum CF is found to be 10 and it belongs to
minterm 2 ^X2 The implicant selected is 3 ^xl ^xl with an NRC of -16.

Finally, the working set should contain only the value 0 (empty square) or 4 (don't care), as
shown in Figure 4.4.


Figure 4.4: Final Working Set


The  final  minimized  result  which  is  kept  in  solution  set  (SS),  g,  is  expressed  as: 

g  =  1  °Xi  °xi  +  2  °Xi  °X2  +  3  ^xi 
As can be seen by comparing the original function and the function resulting
from the ND algorithm, we have a 50% reduction.

TABLE 4.1: CMC AND EDC COMPUTATIONS FOR FIGURE 4.3

            Node 0      Node 1      Node 2      Node 3
            X1 left     X1 right    X2 left     X2 right    CF =
α           dea   ea    dea   ea    dea   ea    dea   ea    dea*(r-1) + ea

2 ^Xi        0     0     1     2     0     0     1     2      10
2            1     1     1     1     0     0     1     2      13
3 ^xl        1     2     0     0     0     0     1     2      10
             0     0     1     2     1     1     1     1      13
             1     1     1     1     1     1     1     1      16
             1     2     0     0     1     1                  13
             0     0     1     2     1     2     0     0      10
2 ^xl X2     1     1     1     1     1     2     0     0      13
3            1     2     0     0     1     2     0     0      10
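The CF column of Table 4.1 can be checked against the host formula CF = dea * (r - 1) + ea, where dea and ea are the totals over the four nodes and r = 4. The Python sketch below does this for three rows whose entries are all present in the table; the function name is illustrative.

```python
RADIX = 4  # four-valued functions


def clustering_factor(pairs, radix=RADIX):
    """CF from the per-node (dea, ea) pairs of one minterm."""
    dea = sum(p[0] for p in pairs)
    ea = sum(p[1] for p in pairs)
    return dea * (radix - 1) + ea


# (dea, ea) for X1 left, X1 right, X2 left, X2 right, copied from Table 4.1.
row_cf_10 = [(0, 0), (1, 2), (0, 0), (1, 2)]
row_cf_13 = [(1, 1), (1, 1), (0, 0), (1, 2)]
row_cf_16 = [(1, 1), (1, 1), (1, 1), (1, 1)]
```

For the first row the totals are dea = 2 and ea = 4, so CF = 2 * 3 + 4 = 10, which is the minimum reported in Example 4.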

C.  COMPARISON  RESULTS 

In this thesis all test results were obtained by running the test functions on
the iPSC/2 computers that were available to us at the NPS Math Department and
Oregon Advanced Computer Information Systems (OACIS), Oregon. Both
computers are the same except that the iPSC/2 at NPS has 8 nodes with 80387
math coprocessors and the iPSC/2 at OACIS has 32 nodes with Weitek 1137
math coprocessors. The choice of which computer to use depended on the size of the
functions we chose to minimize. For example, the iPSC/2 at OACIS was used for
computing five-variable four-valued functions, which need 10 nodes, while the
iPSC/2 at NPS was used for smaller functions. For test purposes, the following
functions were generated by using HAMLET's test generator:

1.  Two-variable  four-valued  with  5  to  50  input  product  terms. 

2.  Three-variable  four-valued  with  5  to  70  input  product  terms. 

3. Four-variable four-valued with 5 to 35 input product terms.

4.  Five-variable  four-valued  with  5  to  35  input  product  terms. 

All input functions were generated randomly. Notice that more test functions
were used for three-variable four-valued expressions than for the others. A
two-variable four-valued function with more than 30 input product terms tends to saturate
and minimizes to one implicant. The three-variable four-valued test functions are
used to see whether the computation time is still exponentially increasing as the number
of input terms is increased. In each case the same expression set is
minimized by both the sequential and the parallel version. The minimization results are
the same in all cases.

For the testing of 2 variable 4 valued expressions, we used 10 different
expression sets of 30 expressions each, consisting of 5 to 50 terms. Figure 5.1 shows
that the parallel algorithm is faster than the sequential one. It can be seen that when
the number of terms in the expression is increased, the computation time also
increases, but the rate of increase is lower for the parallel algorithm. This is especially
true after saturation, which occurs at about 30 terms. In this case, the parallel
computation time drops dramatically and the rate of climb decreases. The main
reason for this decrease is that minterm selection is done only for the first working
set (WS), because all the minterms are saturated and one implicant covers the whole
working set. But even for computing the first working set, all the terms in the
expression must be added according to their coordinates. The sequential program
does this sequentially, so as we increase the number of implicants in the
expression, the computation time also increases. The parallel algorithm works the same
way, but the computation is divided between the nodes, so the rate of increase is
lower.

Figure 5.1: Comparison between Sequential and Parallel Algorithms for
2 variable 4 valued expressions

For 3 variable 4 valued expressions, we minimized expressions which consist
of 5 to 70 terms. Again, each set has 30 different expressions in it. Figure 5.2 shows
that after 45 terms, computation time levels out with the parallel program
proceeding at twice the speed of the sequential program. Comparing Figure 5.1 and
Figure 5.2 shows the similarity between the two graphs. We expect that if we continue
to increase the number of terms in the expressions, we will obtain a similar curve
shape for 3 variable 4 valued expressions.

Figure 5.2: Comparison between Sequential and Parallel Algorithms for
3 variable 4 valued expressions


For 4 and 5 variable 4 valued expressions, we used expressions consisting of
5 to 35 terms. As can be seen from the vertical axes of Figure 5.3 and Figure 5.4,
there is a large difference between the computation times (which are larger for 5
variable expressions). These curves are also similar to the
beginning of the curves for 2 and 3 variable expressions. Saturation needs a large
number of terms for 4 and 5 variables. A 5 variable expression has a 5 dimensional
space, and the number of terms we used was not enough to obtain significant
saturation because the terms are randomly spaced. We expect the curves for 4 and
5 variables to be similar to Figure 5.1 if we increase the number of terms in the
expressions.

Figure 5.3: Comparison between Sequential and Parallel Algorithms for
4 variable 4 valued expressions

Figure 5.4: Comparison between Sequential and Parallel Algorithms for
5 variable 4 valued expressions

V. EXPERIENCES AND FUTURE DEVELOPMENTS


The  experiences  with  iPSC/2  and  an  improved  algorithm  are  reported  in  this 
chapter. 

A.  EXPERIENCES  WITH  iPSC/2 

We encountered a number of problems in using the system and in adapting the
sequential programs to a parallel system. For example, some of the instructions in
HAMLET are system specific and had to be changed for the iPSC/2.

1.  Size  of  the  Messages 

One of the problems encountered while running the ND parallel
algorithm on the iPSC/2 was the size of the messages to be used. A pointer in the C
language holds an address in the memory of the process that created it, so it is
meaningful to another processor only when the memory is shared. The iPSC/2 system
is a distributed memory system, and we cannot use pointers when we need to pass
expressions and coordinates for the minterms to the nodes. Instead, we must use
arrays whose sizes are fixed at compile time. The sizes of the arrays are
defined in the "pardef.h" file [see Appendix A]. The array sizes are very important
because they define the size of the messages that will be sent from the host to the
nodes. We want to keep the array sizes as small as possible to minimize the
communication time. The structure in the program requires the number of variables
and the number of terms in the expression to be defined in the "pardef.h" file. The size
reserved for the terms should be twice the actual number of terms because, while the program
is processing the minimization, the implicants that are found are added to the
working set with a negative coefficient for subtraction purposes. In the worst case,
where no minimization occurs, another set of terms of the same
size as the original set will be added to the working set. As in traditional C,
whenever there is an alteration in the pardef.h file, the program must be
recompiled for the changes to take effect. This procedure did not allow us to use script
programming, and we had to run all the tests one by one.
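The idea of replacing pointer-linked terms with a fixed-size message array can be sketched in Python as follows. Everything about the layout is an assumption made for illustration: the constants `MAX_VARS` and `MAX_TERMS` stand in for the sizes defined in "pardef.h", and a term is taken to be a coefficient followed by a (low, high) bound per variable. Only the factor of two in the capacity comes from the text above.

```python
MAX_VARS = 2    # hypothetical stand-in for the variable count in pardef.h
MAX_TERMS = 4   # hypothetical stand-in for the term count in pardef.h
FIELDS_PER_TERM = 1 + 2 * MAX_VARS  # coefficient + (low, high) per variable


def pack_expression(terms):
    """Flatten terms into a fixed-length list of integers.

    terms: list of (coefficient, [(low, high), ...]) with MAX_VARS bounds.
    The capacity is twice MAX_TERMS so that implicants added with a
    negative coefficient during minimization still fit.
    """
    if len(terms) > MAX_TERMS:
        raise ValueError("too many terms for the compiled array size")
    capacity = 2 * MAX_TERMS
    flat = [0] * (capacity * FIELDS_PER_TERM)
    for i, (coefficient, bounds) in enumerate(terms):
        base = i * FIELDS_PER_TERM
        flat[base] = coefficient
        for v, (low, high) in enumerate(bounds):
            flat[base + 1 + 2 * v] = low
            flat[base + 2 + 2 * v] = high
    return flat
```

Because the message length depends only on the compiled constants, every message has the same size regardless of how many terms are actually in use, which is why small array sizes reduce communication time.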

2.  Debugging 

There are two ways to debug a program: application checkpointing and
the system debugger. Application checkpointing places print instructions at different
points within the source code to monitor the values of the variables and the flow
of the program. For the iPSC/2 this is infeasible. All the nodes and the host use the
screen as the standard output device. Because the nodes run concurrent processes,
several nodes sometimes send print messages to the screen at the same time and the
output becomes unreadable. We used this debugging method only for the host programs.

For debugging purposes, the iPSC/2 offers a debugger called
decon, the "Concurrent Debugger". This debugger allows users to trace the host and
node codes. Decon was found to be very useful. However, we encountered two flaws
in using decon.

The debugger is not complete. Some commands are not implemented yet.
For example, while tracing the program it is not possible to step through more than
one line at a time. This inability to skip multiple lines is inconvenient when
loops are encountered. Another problem is that the debugger does not display the values
of the external variables, which are widely used in the ND algorithm. For example, the
working expression set and the original expression set are external variables that are used
by different procedures.

B.  AN  IMPROVED  ALGORITHM 

The development of the PND algorithm helped us to understand the structure of
HAMLET and to gain experience on the iPSC/2. This work led us to develop
another method, called the Multi-branch Concurrent ND algorithm (MCND), as an
alternative to the recursive sequential algorithm.

Searching for an exact solution by using a recursive algorithm needs a large
amount of computation time. A recursive algorithm keeps track of the minterms
which have equal minimum clustering factors. The program saves the coordinates
of those minterms and computes the other branches to find a better solution.

There are two flaws in the parallel version of the ND algorithm: it searches
only one branch [See Chapter IV Section A] and it uses an excessive amount of message
passing. The primary purpose of the MCND algorithm is to overcome these problems.
The MCND algorithm searches multiple branches of the search tree, and it needs
message passing only for sending the original expression at the beginning of the program.
All nodes are independent of each other and make decisions according to the rules
in Chapter VI Section B. This may provide the fastest computation, because no
synchronization between nodes is needed.

1. The Multi-Branch Concurrent ND Algorithm


An exact optimal solution requires searching the entire tree space. On the other hand,
ND searches only one path leading to a leaf in the tree space. The MCND lies
between ND and the exact solution in its operation. Its effectiveness is limited only by
the number of computational nodes available. MCND does not guarantee an exact
optimal solution. On the other hand, MCND is neither ND nor PND. It is an extension
of PND, since it relaxes the restriction to a single branch of the search tree.

The MCND algorithm is loaded onto all nodes by the host. After the node
programs are loaded, all processes start to execute and then block on a synchronous
receive instruction, waiting for the host to send the message which contains the
original expression set. The host program (which is a part of HAMLET)
converts into arrays the pointers which point to the expressions to be minimized.
The message array contains the expression and the flags for printing the implicants
and maps by the nodes. The host program broadcasts this message to all nodes and
blocks itself, waiting for the results from the nodes.

The nodes which are blocked on a receive instruction continue the
program after the message containing the original expression set is received. The
original expression set and a working expression set are created from the
information in the message array. The algorithm that the nodes execute is the same
as the algorithm in Chapter IV, but the CF_PAR algorithm is replaced
with the Multi-CF (MCF) algorithm.

The MCF algorithm groups the nodes. At the beginning, all nodes in a
cube are in one group with group size numnodes(). The clustering factors are
computed for each minterm and the coordinates of the minterm which has the
smallest CF are saved. If the program encounters a tie, then the first and the last
minterm's coordinates are saved, i.e., even if there are more than two minterms, only
the first and the last one's branches will be searched. The first and last minterms are
selected instead of intermediate ones because, when two minterms are far apart in
coordinate or evaluation sequence, they may have less chance to share the same
destiny. The reason for choosing only two branches of the tree is the expectation of
further branching on those branches and the limited number of nodes available,
because each node will follow another branch of the tree.

Each node knows its node number by using the system call mynode(). If there
is only one minterm with the smallest CF, then the group stays the same and MCF
returns the coordinates of the minterm to the main algorithm, and all nodes follow
the same branch. If there are two or more minterms with the same smallest CF,
then the group is divided into two. The nodes in the first group return the
coordinates of the first minterm, while the second group returns the last one. All
nodes arrange their group start, end, and size variables accordingly. After the
implicant is subtracted in the main algorithm, the main algorithm requests another
most isolated minterm coordinate, and the nodes compute the new working
expression set. If there is again a tie for the smallest CF in
the first group, it divides into two groups again and the two halves return the coordinates of
different minterms. The same procedure is applied
to the other half of the original group, which follows another branch. A group size of
1 indicates that we do not have nodes for further division. At this point, the
algorithm returns the first minterm's coordinates to the main algorithm of the node
program.

MS_f    : Original Expression Set

WS      : Working Expression Set

SS      : Solution Set

MAX_INT : Maximum Integer Number

{

    SS ← ∅;   /* SS = Solution Set */
    CUR_CF ← MAX_INT
    CUR_CF2 ← MAX_INT

    mygroup_start ← 0
    mygroup_size ← numnodes()
    mygroup_end ← mygroup_size - 1

    WS = MS_f = { α | α is generated by the function f; if α ∈ SAT then mark its
    coordinate }.

    While WS ≠ ∅ do {

        1. Use algorithm MCF to select a minterm α from the WS.

        2. Use algorithm N to select an implicant I_α that covers α.

        3. SS ← SS ∪ I_α.

        4. ∀ β ∈ I_α do {

            compute g(β) ← g(β) - g(α).   /* I_α is subtracted from WS */

            if β is originally marked and g(β) = 0 then g(β) ← r.

            /* don't care terms */

        }

    }

}

ALGORITHM MCF

{

    ∀ α ∈ WS do {

        Compute CF                      /* compute the CF for minterm α */

        if (CF < CUR_CF) {              /* if the CF of minterm α is less than the current CF, */
            CUR_CF ← CF                 /* assign the CF to CUR_CF and save minterm α's */
            savecoord1 ← X_i            /* coordinates to savecoord1 */
        }
        elseif (CF = CUR_CF) {          /* if the CF of minterm α equals the current CF, */
            CUR_CF2 ← CF                /* assign it to CUR_CF2 and save its */
            savecoord2 ← X_i            /* coordinates to savecoord2 */
        }

    }

    /* if the saved CFs are not the same, there is only one smallest CF; */
    /* a group of size 1 has no nodes left for further division. */
    /* In both cases return the first coordinates. */
    if (CUR_CF ≠ CUR_CF2 or mygroup_size = 1)
        return(savecoord1)

    /* if the two saved CFs are the same then we have a tie: */
    /* each node gets its node number and calculates the first half of the group. */
    /* If the node is in the first half, it reassigns the group variables */
    /* and returns the first coordinates. */
    elseif (mynode() < (mygroup_start + mygroup_size/2)) {
        mygroup_end ← mygroup_start + (mygroup_size/2 - 1)
        mygroup_size ← mygroup_size/2
        return(savecoord1)
    }

    /* if the node is not in the first half, it reassigns the group variables */
    /* for that node and returns the coordinates of the last tied minterm. */
    else {
        mygroup_start ← mygroup_start + mygroup_size/2
        mygroup_size ← mygroup_size/2
        return(savecoord2)
    }

}
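The group-splitting rule can be traced with a small Python sketch that follows the prose description above: on a tie the first half of a group follows the first minterm and the second half follows the last one, and a group of size 1 always follows the first. The function name and the return convention (branch 0 for the first minterm, branch 1 for the last) are illustrative assumptions.

```python
def split_group(node: int, start: int, size: int, tie: bool):
    """Return (branch, new_start, new_size) for one node.

    branch 0 means "follow the first tied minterm", branch 1 "follow the
    last tied minterm". Without a tie, or when the group cannot be divided
    further, the node keeps its group and follows branch 0.
    """
    if not tie or size == 1:
        return 0, start, size
    half = size // 2
    if node < start + half:
        return 0, start, half          # first half of the group
    return 1, start + half, half       # second half of the group
```

On an 8-node cube with a tie at the root, nodes 0 to 3 form the group (start 0, size 4) and nodes 4 to 7 form the group (start 4, size 4), matching the division described in Example 5.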


In the command line used to invoke the program, there are three flags
that can be set: "-m", "-i" and "-o". These flags allow the user to print the Karnaugh
maps (-m), and the CF of the minterm and the NRC of the implicant (-i). The iPSC/2 uses
a concurrent file system which allows each individual node to open its own files with
the node number as suffix. The "-o" flag specifies the name of the output file. These files
provide the execution trace to the user.

The main algorithm of each node sends a message to the host program.
This message includes the number of the node which sends the message, the number
of implicants in its minimized result, the ratio of the minimization and the time spent
for computation. The host program sorts the results and picks the result which has
the maximum ratio as the solution. The computation time is defined as the
computation time of the node which spent the maximum time.
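The host's final selection can be sketched in Python as follows. The report fields mirror the message contents listed above, but the dictionary layout, the function name, and the sample values are illustrative assumptions, not measured results.

```python
def pick_result(reports):
    """Choose the reported solution and the overall computation time.

    reports: list of dicts with keys "node", "implicants", "ratio" and
    "seconds". The solution is the report with the largest minimization
    ratio; the computation time is that of the slowest node.
    """
    best = max(reports, key=lambda r: r["ratio"])
    elapsed = max(r["seconds"] for r in reports)
    return best, elapsed


# Illustrative values only.
sample_reports = [
    {"node": 0, "implicants": 6, "ratio": 0.25, "seconds": 1.5},
    {"node": 2, "implicants": 5, "ratio": 0.375, "seconds": 2.0},
    {"node": 4, "implicants": 4, "ratio": 0.5, "seconds": 1.0},
]
```

Taking the maximum time rather than the time of the winning node reflects that the host must wait for every node before it can compare results.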

Example  5: 

Assume we have an 8-node cube. Let the original expression be sent to all nodes
by message passing from the host. At the beginning, all nodes are assigned to one
group. The MCF algorithm on the nodes finds two minterms with equal smallest CFs
[See Figure 6.1]. Nodes #0-#3 assign themselves to the first group and search for
a loosely coupled implicant for CF1, and nodes #4-#7 search for CF2.

Nodes #0-#3 compute three equal smallest CFs (CF11, CF12 and CF13) and
select the first and third ones for searching. They divide into two groups again, and
the first group, which consists of nodes #0 and #1, computes two more CFs (CF111,
CF112). Node #0 follows CF111 and finds a solution after finding CF1111. This
solution is the same as the solution that is computed by the ND algorithm. Node #1
searches CF112 and computes another CF (CF1121). The group is out of nodes,
so even if it finds more than one CF it will only follow the first one.

Nodes #4-#7 compute CF21 and CF22. CF21 leads the algorithm to an optimum
solution. Nodes #4 and #5 compute CF211 and reach a solution. After all nodes have
finished their tasks, they report their solutions and computation results to the host
program. The host program selects the minimum result as the solution and the
maximum computation time as the computation time of the expression.


Example  6: 

We tested 100 two-variable four-valued expressions using the ND algorithm and the
MCND algorithm. For four of these expressions, the MCND algorithm did better than the
ND algorithm. One of them is selected as an example. The input expression to be
minimized is expressed as:


f  =  2 


X^-^l  ^x|+3  ^Xi  ^xi+1  °Xi  °X2^+2  °X^  ^xi  + 


1 


X, +1 


2v.3 


Xz'+l  ^Xi  °X2°+1 


2  0 


The working expression set is initialized to MS, and the original expression is
represented in Figure 6.2. The CF values of all minterms in the working set are
computed. CF value 4 is found for minterms 2 ^Xi^ ^Xz and 1 °Xi ^X2^. The
nodes are divided into two groups. The first group follows the first minterm and the
second group follows the second. The first group finds only one smallest CF and
computes the same implicant. The working set then has a tie again, and the nodes in
the first group are divided into two. Nodes #0 and #1 find a solution consisting of 6
implicants. This is the same solution found by the ND and PND algorithms [See
Appendix D]. Nodes #2 and #3 find a solution that consists of 5 implicants. The
second group of nodes is not divided, i.e., it encounters no ties. Nodes #4-#7 find the
optimum solution with 4 implicants. The search space and the group selections are
shown in Figure 6.3.

Figure 6.2: Original expression map for Example 6
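The clustering factor (CF) used to rank minterms can be illustrated on a small two-variable map. The sketch below follows the non-truncated branch of mim() in Appendix B (CF = DEA * (radix - 1) + EA), but it is a simplification under stated assumptions: the map is a plain 4x4 array of values, don't cares are not modeled, the highest-logic-value flag is approximated as "value equals radix - 1", and `clustering_factor` is a name introduced here.

```c
#include <assert.h>

#define RADIX 4   /* four-valued logic, as in Example 6 */

/* Clustering factor of the minterm at (x1, x2).
 * map[x1][x2] holds the function value there; 0 means no minterm.
 * EA counts the cells reached by expanding in each direction, DEA counts
 * the directions in which at least one step of expansion was possible. */
int clustering_factor(int map[RADIX][RADIX], int x1, int x2)
{
    int r1 = RADIX - 1;
    int value = map[x1][x2];
    int hlv = (value == r1);      /* assumed stand-in for the HLV flag */
    int ea = 0, dea = 0;
    int dir, step;

    assert(value > 0);
    for (dir = 0; dir < 2; dir++) {                /* each variable  */
        for (step = -1; step <= 1; step += 2) {    /* left, right    */
            int c[2];
            int expanded = 0;

            c[0] = x1;
            c[1] = x2;
            for (;;) {
                int v;

                c[dir] += step;
                if (c[dir] < 0 || c[dir] > r1)
                    break;                          /* map edge       */
                v = map[c[0]][c[1]];
                if (v && (v <= value || hlv)) {
                    expanded = 1;
                    ea++;
                } else {
                    break;
                }
            }
            if (expanded)
                dea++;
        }
    }
    return dea * r1 + ea;
}
```

An isolated minterm has CF 0; a minterm at the end of a row of three equal-valued cells expands in one direction over two cells, giving 1 * 3 + 2 = 5.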


The solution found by the ND and PND algorithms:

f  =  2  ^xl  °X°  °X2+1  ^xl  °xl+l  ^xl  °X2^+1  ^xl  °X2+3  °Xi  ^X2 

The optimal solution found by MCND:

f  =  2  °Xi  °X2+1  ^X2+3  ^Xi  ^xi+3  °xl  ^xl 

As can be seen, the MCND algorithm finds a better solution than the PND and ND
algorithms. The selected minterms and implicants are reported in Appendix D.
Figure 6.3: Search tree for Example 6
VI. SUMMARY AND CONCLUSIONS


As discussed in Chapter III Section C, certain requirements must be met in order
to derive the full benefit of parallel processing. Ideally, two nodes should halve
the time needed by a single node. This is possible only when the node programs
run completely independently on different nodes and no communication time is
required.
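This requirement can be made concrete with a simple timing model. The model below is an assumption introduced for illustration, not a formula from the thesis: $T_1$ is the single-node time, $p$ the number of nodes, and $T_{comm}$ the total time spent on message passing.

```latex
T_p = \frac{T_1}{p} + T_{comm},
\qquad
S(p) = \frac{T_1}{T_p} = \frac{p}{1 + p\,T_{comm}/T_1}
```

With $p = 2$ and $T_{comm} = 0$ the speed-up is exactly $S(2) = 2$; any nonzero communication time pushes it below 2.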

The ND Algorithm runs sequentially. Only after an implicant has been selected
and subtracted from the working expression set can the algorithm proceed to
compute another implicant. The update of the working expression set must be
completed before the computation can continue. Only the clustering factor
computation was amenable to parallel execution, but this brought in the problem of
communication. Our system was a distributed memory system; nodes cannot access
the data for the expressions from a shared memory location. All of the information
about the expression and the coordinates of the implicant must be passed to the
nodes by messages, and this must be done for each and every clustering factor
computation request. Clustering factor computation does not constitute a large part
of the dynamic code, and the communication time increases as the number
of terms and inputs increases.
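To give a feel for the message volume, the sketch below mirrors the buffer layouts of PARDEF.H (Appendix A) and multiplies the buffer sizes by the number of clustering factor requests. The `cf_traffic_bytes` helper is hypothetical, it counts one outgoing buffer and one reply per request, and the actual sizes depend on the compiler's padding and integer widths.

```c
#include <assert.h>
#include <limits.h>

#define NVAR  3      /* number of variables, as in PARDEF.H */
#define NTERM 100    /* 2 * number of terms, as in PARDEF.H */

typedef short msg_coord;
typedef struct { short lower, upper; } msg_bound;
typedef struct { msg_bound B[NVAR]; short coeff, rbc; } msg_implicant;
typedef struct { msg_implicant I[NTERM]; short radix, nvar, nterm; } msg_expression;
typedef struct {
    msg_expression E;          /* the whole working expression */
    msg_coord X[NVAR + 2];     /* coordinates of the minterm   */
    int node_no, radix, nvar, All_Trun, value_msg[2];
} msg_to_node;
typedef struct { int ea, dea; } msg_from_node;

/* Bytes exchanged for `requests` clustering factor requests, counting one
 * msg_to_node buffer sent and one msg_from_node reply received per request. */
unsigned long cf_traffic_bytes(unsigned long requests)
{
    unsigned long per_request =
        (unsigned long)(sizeof(msg_to_node) + sizeof(msg_from_node));

    assert(requests <= ULONG_MAX / per_request);   /* no overflow */
    return requests * per_request;
}
```

Because the whole expression (NTERM implicants) travels with every request, the per-request cost grows with the number of terms and variables even though each reply is only two integers.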
We obtained a speed-up of two in all cases. This speed-up gives us an
advantage in computing the MVL expressions compared to all other heuristics.

The PND Algorithm is a faster ND algorithm. The ND algorithm is a
heuristic, i.e., it finds a near-minimal solution, not an exact solution. The
ND algorithm can be improved in two ways: a recursive ND algorithm or a concurrent
ND algorithm. We chose the concurrent algorithm because a recursive algorithm
would need too much computation time. The Multi-branch Concurrent ND
Algorithm is expected to spend less time computing the solution than a
recursive sequential algorithm. We expect the recursive algorithm to have a
computation time of

$$\sum_{\mathit{nodeno}=0}^{\mathit{numnodes}()-1} \mathit{computation\_time}(\mathit{nodeno})$$
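Since the MCND computation time is defined as the time of the slowest node (Example 5), the two estimates can be compared directly. Writing $t_i$ for computation_time(i) and $p$ for numnodes() (shorthand introduced here), and assuming the recursive algorithm visits the same branches the nodes do, the following holds for any nonnegative times:

```latex
T_{MCND} = \max_{0 \le i < p} t_i
\;\le\;
\sum_{i=0}^{p-1} t_i = T_{rec}
\;\le\;
p \cdot T_{MCND}
```

The recursive sequential algorithm is therefore never faster than MCND under this assumption, and at most $p$ times slower.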

The MCND algorithm uses only two message passing instructions: the first one
broadcasts the expression to the nodes and the second one collects the results from
the nodes. Because the results arrive at different times, the time spent receiving
the messages from the nodes is small. The MCND algorithm realizes the minimum
communication time and maximum computation time. Even though the MCND
algorithm is still a heuristic, the results are very close to the exact solution.
APPENDIX A: PND ALGORITHM PROGRAM LISTINGS


PARDEF.H

/*
 * This file provides the additional structures defined
 * in the pardef.h file. The structures defined in this file are
 * used only by ndpar.c, cf_left.c and cf_right.c.
 */

#define MSG_TYPE1 1     /* This msg type is for sending
                           messages to the nodes */
#define MSG_TYPE2 2     /* This msg type is for receiving
                           messages from the nodes */
#define HOST_PID  10    /* process id for the host */
#define NODE_PID  0     /* process id for the node process */
#define NVAR      3     /* number of variables in expr */
#define NTERM     100   /* 2*number of terms in expr */

typedef short msg_coord;    /* buffer for coord of minterm */

typedef struct {            /* buffer for upper and lower */
    short lower,            /* limits of terms */
          upper;
} msg_bound;

typedef struct {            /* buffer for implicant */
    msg_bound B[NVAR];
    short coeff,
          rbc;
} msg_implicant;

typedef struct {            /* buffer for expression */
    msg_implicant I[NTERM];
    short radix,
          nvar,
          nterm;
} msg_expression;

typedef struct {            /* buffer for whole data to be */
    msg_expression E;       /* sent to nodes */
    msg_coord X[NVAR+2];
    int node_no,
        radix,
        nvar,
        All_Trun,
        value_msg[2];
} msg_to_node;

typedef struct {            /* buffer for msg from the node */
    int ea,
        dea;
} msg_from_node;
NDPAR.C (HOST PROGRAM)


#include "defs.h"
#include <cube.h>
#include "pardef.h"

/* Parallel Neighborhood Decoupling Algorithm by Oral & Yang */

ND_PAR()
/*----------------------------------------------------------------
:function:
    - Perform the Parallel Algorithm on the input expression
:algorithm:
    Start with a working copy E_work of the original
    function E_orig;
    Initialize a final function E_final;
    While (there are still minterms to pick) {
        Pick a minterm X from E_work;
        Pick the best implicant I for X;
        Subtract I from E_work;
        Add I to E_final;
    }
:globals:
    E_orig
    e_flag
    m_flag
    q_flag
    G_flag
    FO_ratio
:side_effects:
    *STAT
    HEUR
    E_work
    E_final[]
:called_by:
    main()
:calls:
    dealloc_expr()
    dup_expr()
    print_terms()
    print_map()
    mim()
    pick_implicant()
    subtract_implicant()
    print_source()
----------------------------------------------------------------*/
{
    register i;
    int num_impl = 0,
        better_found;
    int *X;
    Implicant *I;
    float ratio;

    if (E_final[N_P].I != NULL)
        dealloc_expr(&E_final[N_P]);

#   ifdef ANALYZER
    STAT = &NP_stat;
#   endif

    HEUR = N_P;
    dup_expr(&E_work,&E_orig);

    E_final[HEUR].nterm = 0;
    E_final[HEUR].radix = E_orig.radix;
    E_final[HEUR].nvar = E_orig.nvar;
    E_final[HEUR].I = NULL;

    /* load the node programs once: left-expansion nodes in the lower
       half of the cube, right-expansion nodes in the upper half */
    if (!load_flag) {
        setpid(HOST_PID);
        for (i=0; i < (numnodes()/2); i++) {
            load("/usr/oral/onurpar/mvlcpar/cf_left",i,0);
            load("/usr/oral/onurpar/mvlcpar/cf_right",
                 i + (numnodes()/2),0);
        }
        load_flag = 1;
    }

#   ifdef ANALYZER
    if (e_flag)
        print_terms(&E_orig);
    if (m_flag) {
        printf(" Orig map (ND_PAR):\n");
        print_map();
    }
#   endif

    better_found = 0;
    resource_used(START);
    for (;;) {
        if ((X = mim(&E_work)) == NULL) {
            if (num_impl < E_orig.nterm)
                better_found = 1;
            break;
        }
        I = pick_implicant(X);
        num_impl++;
        subtract_implicant(I);

#       ifdef ANALYZER
        if (i_flag)
            print_implicant(X,I);
        if (m_flag)
            print_map();
#       endif

        if (Sm_flag) {
            if (num_impl >= E_orig.nterm)
                break;
        }
    }
    resource_used(STOP);

    if (!better_found) {
        num_impl = E_orig.nterm;
        dup_expr(&(E_final[N_P]),&E_orig);
    }
    ratio = ((double)num_impl/(double)E_orig.nterm);

#   ifdef ANALYZER
    if (!q_flag && !G_flag) {
        if (!better_found)
            printf("%-4d ND_PAR: %4d/%-4d %4.2f %6ld:%3.3ld\n",
                   expr_seq,num_impl,num_impl,0.0,secs_used(),tsecs_used());
        else
            printf("%-4d ND_PAR: %4d/%-4d %4.2f %6d:%3.3ld\n",
                   expr_seq,num_impl,E_orig.nterm,ratio,
                   secs_used(),tsecs_used());
    }
#   endif

    dealloc_expr(&E_work);
}


static int *mim(E)
Expression *E;
/*----------------------------------------------------------------
:function:
    - Find the Most Isolated Minterm in the expression pointed to by E, and
      return its coordinates as a vector.
    - Local to ndpar.c
:globals:
    radix
    nvar
:side_effects:
    *STAT
:called_by:
    ND_PAR()
:calls:
    next_coord()
    eval()
    vcopy()
:returns:
    - A vector of integers representing the coordinate of the
      most isolated minterm, or NULL if no more minterms.
    - The value at that location is also returned as the last
      integer in the vector.
----------------------------------------------------------------*/
{
    register i,j,k;
    int cur_val = E->radix,
        cur_CF = MAX_INT,
        X_orig[MAX_VAR+2],
        R_1 = radix - 1,
        Not_all = 0,
        All_trun = 0,
        TRUN = 2*R_1,
        last = 0,
        value[2],
        cf,
        ea,
        dea,
        term;
    int *X,*next_coord();
    static int
        coord[MAX_VAR+2],
        save_coord[MAX_VAR+2];
    msg_to_node
        msg_to_node_cf;
    msg_from_node
        msg_from_node_cf;

#   ifdef ANALYZER
    STAT->calls_mim++;
#   endif

    /* copy the working expression into the message buffer */
    for (i=0; i < E_work.nterm; i++) {
        msg_to_node_cf.E.I[i].coeff = E->I[i].coeff;
        msg_to_node_cf.E.I[i].rbc = E->I[i].rbc;
        for (j=0; j < nvar; j++) {
            msg_to_node_cf.E.I[i].B[j].upper = E->I[i].B[j].upper;
            msg_to_node_cf.E.I[i].B[j].lower = E->I[i].B[j].lower;
        }
    }
    msg_to_node_cf.E.radix = radix;
    msg_to_node_cf.E.nvar = nvar;
    msg_to_node_cf.E.nterm = E_work.nterm;
    msg_to_node_cf.All_Trun = All_trun;

    for (term=0; term < E_orig.nterm; term++) {
        k = 1;
        while ((X=next_coord(coord,&(E->I[term]),k)) != NULL) {
            vcopy(value,eval(&E_work,X));
            if (value[EVAL] && value[EVAL] < radix) {
                cf = 0;
                dea = 0;
                ea = 0;
                if (!value[HLV])
                    Not_all = 1;

                /* send the minterm to all nodes and collect EA and DEA */
                msg_to_node_cf.All_Trun = All_trun;
                for (i=0; i < nvar+2; i++) msg_to_node_cf.X[i] = X[i];
                vcopy(msg_to_node_cf.value_msg,value);
                csend(MSG_TYPE1,&msg_to_node_cf,sizeof(msg_to_node_cf),-1,0);
                crecv(MSG_TYPE3,&msg_from_node_cf,sizeof(msg_from_node_cf));

                /* compute the clustering factor */
                cf = (msg_from_node_cf.dea * R_1) +
                     msg_from_node_cf.ea;
                if ((!(value[HLV] && cf > TRUN)) || All_trun) {
                    if (cf < cur_CF) {
                        cur_val = value[EVAL];
                        cur_CF = cf;
                        for (i=0; i < nvar; i++) save_coord[i] = X[i];
                    }
                }
            }
            k = 0;
        }
        if (!last && (term == (E_orig.nterm - 1)) && !Not_all) {
            All_trun = 1;
            cur_CF = MAX_INT;
            term = -1;
            last = 1;
        }
    }

    if (cur_CF == MAX_INT)
        return(NULL);

    save_coord[nvar+1] = cur_CF;
    save_coord[nvar] = cur_val;
    return(save_coord);
}

static int valid_implicant(I)
Implicant *I;
/*----------------------------------------------------------------
:function:
    - Decide upon the validity of implicant I
    - Local to ndpar.c
:globals:
    E_work
    E_orig
:side_effects:
    *STAT
:called_by:
    pick_implicant()
:calls:
    next_coord()
    eval()
    vcopy()
:returns:
    1 if a valid implicant
    0 if not
----------------------------------------------------------------*/
{
    int *X;
    int init = 1;
    int R_1 = radix - 1;
    int value = I->coeff;
    int Vo[2],Vw[2];
    static int
        coord[MAX_VAR+2];

#   ifdef ANALYZER
    STAT->calls_valid_implicant++;
#   endif

    while ((X = next_coord(coord,I,init)) != NULL) {
        init = 0;
        vcopy(Vw,eval(&E_work,X));
        vcopy(Vo,eval(&E_orig,X));
        if (((Vw[EVAL] < value) && !Vw[HLV]) && (Vo[EVAL] < R_1))
            return(0);
    }
    return(1);
}
static int compute_rbc(I)
Implicant *I;
/*----------------------------------------------------------------
:function:
    - Compute the RBC for the given implicant
    - Local to ndpar.c
:globals:
    radix
    nvar
:side_effects:
    STAT
:called_by:
    pick_implicant()
:calls:
    next_coord()
    eval()
    vcopy()
:returns:
    - an integer RBC
----------------------------------------------------------------*/
{
    int *X;
    int I_value = I->coeff;
    register i;
    int value[2],
        R_1 = radix - 1,
        neighbor_value[2],
        good,
        bad,
        diff,
        equal,
        neig_boun,
        first,
        rbc = 0,
        init = 1;
    static int
        coord[MAX_VAR+2];

#   ifdef ANALYZER
    STAT->calls_compute_rbc++;
#   endif

    /* for each coordinate in the implicant ... */
    while ((X = next_coord(coord,I,init)) != NULL) {
        init = 0;
        equal = 0;
        vcopy(value,eval(&E_work,X));
        if (value[EVAL] == radix)
            continue;
        diff = value[EVAL] - I_value;
        first = 1;

        /* for each direction ... */
        for (i=0; i < nvar; i++) {
            good = 0;
            bad = 0;
            if ((diff <= 0) && first) {
                good = 2;
                first = 0;
            }

            /* if there is a left neighbor, examine it */
            if (X[i] != 0 && X[i] == I->B[i].lower) {
                X[i]--;
                vcopy(neighbor_value,eval(&E_work,X));
                neig_boun = neighbor_value[EVAL] - value[EVAL];
                X[i]++;
                if (neighbor_value[EVAL] != 0) {
                    if (!neighbor_value[HLV] || !value[HLV]) {
                        if (neighbor_value[EVAL] < diff) {
                            if (neighbor_value[HLV])
                                good += 1;
                            else
                                bad += 2;
                        }
                        if (neighbor_value[EVAL] > diff) {
                            if (!neig_boun)
                                bad += 2;
                            if (neighbor_value[HLV] && neig_boun < 0)
                                bad += 2;
                            if (diff > 0 && neig_boun) {
                                if (value[HLV])
                                    good += 1;
                                else
                                    bad += 2;
                            }
                        }
                        else {
                            if (neighbor_value[HLV] || value[HLV])
                                good += 1;
                            else
                                good += 2;
                        }
                    }
                }
            }

            /* if there is a right neighbor, examine it */
            if (X[i] != R_1 && X[i] == I->B[i].upper) {
                X[i]++;
                vcopy(neighbor_value,eval(&E_work,X));
                neig_boun = neighbor_value[EVAL] - value[EVAL];
                X[i]--;
                if (neighbor_value[EVAL] != 0) {
                    if (!neighbor_value[HLV] || !value[HLV]) {
                        if (neighbor_value[EVAL] < diff) {
                            if (neighbor_value[HLV])
                                good += 1;
                            else
                                bad += 2;
                        }
                        if (neighbor_value[EVAL] > diff) {
                            if (!neig_boun)
                                bad += 2;
                            if (neighbor_value[HLV] && neig_boun < 0)
                                bad += 2;
                            if (diff > 0 && neig_boun) {
                                if (value[HLV])
                                    good += 1;
                                else
                                    bad += 2;
                            }
                        }
                        else {
                            if (neighbor_value[HLV] || value[HLV])
                                good += 1;
                            else
                                good += 2;
                        }
                    }
                }
            }

            /* update the rbc */
            rbc = (rbc - good) + bad;
        }
    }
    return(rbc);
}


static Implicant *pick_implicant(X)
int *X;
/*----------------------------------------------------------------
:function:
    - Pick the best implicant for minterm X
:globals:
    radix
:side_effects:
    STAT
:called_by:
    ND_PAR()
:calls:
    init_implicant()
    gen_bounds()
    next_implicant()
    eval()
    vcopy()
    compute_rbc()
    copy_implicant()
    valid_implicant()
:returns:
    - A pointer to a term representing the best implicant.
----------------------------------------------------------------*/
{
    int cur_rbc = MAX_INT,
        rbc = 0,
        I_value,
        i,
        init = 1,
        first = 1;
    Implicant *I;
    static int
        coord[MAX_VAR+2];
    static Bound I_bound[MAX_VAR+2];
    static Implicant I_best;
    Bound *B;
    int V[2],
        value[2];

#   ifdef ANALYZER
    STAT->calls_pick_implicant++;
#   endif

    I_best.B = I_bound;
    init_implicant(X);
    B = gen_bounds(X);
    vcopy(V,eval(&E_orig,X));
    while ((I = next_implicant(B)) != NULL) {
        if (V[HLV]) {
            for (I->coeff=X[nvar]; I->coeff < radix;
                 (I->coeff)++) {
                if (valid_implicant(I)) {
                    rbc = compute_rbc(I);
                    if (first)
                        rbc = 2;
                    else
                        rbc += 2;
                    if (rbc <= cur_rbc) {
                        cur_rbc = rbc;
                        I->rbc = rbc;
                        copy_implicant(&I_best,I);
                    }
                }
            }
            first = 0;
        }
        else {
            I->coeff = X[nvar];
            if (valid_implicant(I)) {
                rbc = compute_rbc(I);
                if (first) {
                    first = 0;
                    if (rbc < 0)
                        rbc = 1;
                    else
                        rbc += 2;
                }
                else
                    rbc += 2;
                if (rbc <= cur_rbc) {
                    cur_rbc = rbc;
                    I->rbc = rbc;
                    copy_implicant(&I_best,I);
                }
            }
        }
    }
    return(&I_best);
}


NODE PROGRAM LISTINGS


CF_LEFT.C (NODE PROGRAM)

#include "defs.h"
#include "pardef.h"
#include <cube.h>

main() {
    int
        expanded,
        var_no,
        val1[2];
    long ea[2],
         work1[2];
    msg_to_node
        msg_to_node_cf;
    msg_from_node
        msg_from_node_cf;

    for (;;) {
        ea[0] = 0;
        ea[1] = 0;
        expanded = 0;

        crecv(MSG_TYPE1,&msg_to_node_cf,sizeof(msg_to_node_cf));
        var_no = mynode();

        /* expand to the left along this node's variable */
        if (var_no < msg_to_node_cf.E.nvar) {
            while (msg_to_node_cf.X[var_no] > 0) {
                msg_to_node_cf.X[var_no]--;
                vcopy(val1,eval(msg_to_node_cf.E,msg_to_node_cf.X));
                if (val1[EVAL] && (val1[EVAL] <=
                        msg_to_node_cf.value_msg[EVAL]
                        || msg_to_node_cf.value_msg[HLV])) {
                    expanded = 1;
                    ea[0]++;
                }
                else break;
            }
            if (expanded) ea[1]++;
        }

        /* global sum of EA and DEA over all nodes; node 0 reports */
        gisum(&ea[0],2,&work1[0]);
        if (mynode() == 0) {
            msg_from_node_cf.ea = ea[0];
            msg_from_node_cf.dea = ea[1];
            csend(MSG_TYPE3,&msg_from_node_cf,sizeof(msg_from_node_cf),
                  myhost(),HOST_PID);
        }
    }
}

int *eval(E,X)
msg_expression E;
short X[NVAR];
/*----------------------------------------------------------------
:function:
    - Evaluate the expression at X, where X is a vector of
      coordinates
:returns:
    - A vector with the value of the expression at the
      specified coordinate as its first element, and a flag
      set if this value has attained the highest logic value
      (HLV)
----------------------------------------------------------------*/
{
    register i,j,k;
    int out_of_bounds;
    static int V[2];
    register rm1 = E.radix-1;

    V[EVAL] = 0;
    V[HLV] = 0;

    /* for each term ... */
    for (i=0; i < E.nterm; i++) {
        /* for each variable ... */
        for (j=0,out_of_bounds=0; j < E.nvar; j++) {
            if (
                (X[j] < E.I[i].B[j].lower) ||
                (X[j] > E.I[i].B[j].upper)
            ) {
                out_of_bounds = 1;
                break;
            }
        }
        if (out_of_bounds)
            continue;

        /* if this is a don't care, return the radix */
        if (E.I[i].coeff == E.radix) {
            V[EVAL] = E.radix;
            return(V);
        }

        V[EVAL] += E.I[i].coeff;
        if (V[EVAL] >= rm1) {
            /* set a flag which means E_orig was saturated at this X */
            V[HLV] = 1;
        }
        if (V[EVAL] > rm1) {
            V[EVAL] = rm1;
        }
        else if (V[HLV] && (V[EVAL] <= 0)) {
            V[EVAL] = E.radix;
            return(V);
        }
    }
    return(V);
}

vcopy(d,s)
int *d,*s;
{
    d[0] = s[0];
    d[1] = s[1];
}


CF_RIGHT.C (NODE PROGRAM)

#include "defs.h"
#include "pardef.h"
#include <cube.h>

main() {
    int
        expanded,
        var_no,
        val1[2];
    long ea[2],
         work1[2];
    msg_to_node
        msg_to_node_cf;
    msg_from_node
        msg_from_node_cf;

    for (;;) {
        ea[0] = 0;
        ea[1] = 0;
        expanded = 0;

        crecv(MSG_TYPE1,&msg_to_node_cf,sizeof(msg_to_node_cf));
        var_no = mynode() - (numnodes()/2);

        /* expand to the right along this node's variable */
        if (var_no < msg_to_node_cf.E.nvar) {
            while (msg_to_node_cf.X[var_no] < ((msg_to_node_cf.E.radix)-1)) {
                msg_to_node_cf.X[var_no]++;
                vcopy(val1,eval(msg_to_node_cf.E,msg_to_node_cf.X));
                if (val1[EVAL] && (val1[EVAL] <=
                        msg_to_node_cf.value_msg[EVAL]
                        || msg_to_node_cf.value_msg[HLV])) {
                    expanded = 1;
                    ea[0]++;
                }
                else break;
            }
            if (expanded) ea[1]++;
        }

        /* contribute to the global sum of EA and DEA */
        gisum(&ea[0],2,&work1[0]);
    }
}

int *eval(E,X)
msg_expression E;
short X[NVAR];
/*----------------------------------------------------------------
:function:
    - Evaluate the expression at X, where X is a vector of
      coordinates
:returns:
    - A vector with the value of the expression at the
      specified coordinate as its first element, and a flag
      set if this value has attained the highest logic value
      (HLV)
----------------------------------------------------------------*/
{
    register i,j,k;
    int out_of_bounds;
    static int V[2];
    register rm1 = E.radix-1;

    V[EVAL] = 0;
    V[HLV] = 0;

    /* for each term ... */
    for (i=0; i < E.nterm; i++) {
        /* for each variable ... */
        for (j=0,out_of_bounds=0; j < E.nvar; j++) {
            if (
                (X[j] < E.I[i].B[j].lower) ||
                (X[j] > E.I[i].B[j].upper)
            ) {
                out_of_bounds = 1;
                break;
            }
        }
        if (out_of_bounds)
            continue;

        /* if this is a don't care, return the radix */
        if (E.I[i].coeff == E.radix) {
            V[EVAL] = E.radix;
            return(V);
        }

        V[EVAL] += E.I[i].coeff;
        if (V[EVAL] >= rm1) {
            /* set a flag which means E_orig was saturated at this X */
            V[HLV] = 1;
        }
        if (V[EVAL] > rm1) {
            V[EVAL] = rm1;
        }
        else if (V[HLV] && (V[EVAL] <= 0)) {
            V[EVAL] = E.radix;
            return(V);
        }
    }
    return(V);
}

vcopy(d,s)
int *d,*s;
{
    d[0] = s[0];
    d[1] = s[1];
}
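The eval() routine in the node programs computes a truncated sum: the coefficients of all terms covering a coordinate are added, and the result is clipped at radix - 1. The following standalone sketch isolates that rule so it can be tried on a workstation. It is a simplified illustration under stated assumptions: the `mv_bound` and `mv_term` types and the `eval_tsum` function are names introduced here, two variables are assumed, and the listing's special case for non-positive sums after saturation is omitted.

```c
#include <assert.h>

#define NVAR 2   /* assumed number of variables for this sketch */

typedef struct { short lower, upper; } mv_bound;            /* literal window  */
typedef struct { mv_bound B[NVAR]; short coeff; } mv_term;  /* one product term */

/* Truncated-sum value of the expression at coordinate X.
 * Sets *hlv to 1 if the running sum reached radix - 1 (saturation).
 * A term whose coefficient equals the radix is treated as a don't care,
 * and the radix itself is returned, as in the listing. */
int eval_tsum(const mv_term *t, int nterm, int radix,
              const short X[NVAR], int *hlv)
{
    int i, j, inside;
    int v = 0, rm1 = radix - 1;

    assert(radix >= 2);
    *hlv = 0;
    for (i = 0; i < nterm; i++) {
        inside = 1;
        for (j = 0; j < NVAR; j++) {
            if (X[j] < t[i].B[j].lower || X[j] > t[i].B[j].upper) {
                inside = 0;
                break;
            }
        }
        if (!inside)
            continue;
        if (t[i].coeff == radix)
            return radix;          /* don't care */
        v += t[i].coeff;
        if (v >= rm1)
            *hlv = 1;              /* saturated at this coordinate */
        if (v > rm1)
            v = rm1;               /* truncate the sum */
    }
    return v;
}
```

For two overlapping terms with coefficient 2 in four-valued logic, a coordinate covered by both evaluates to 3 (2 + 2 truncated) with the saturation flag set, while a coordinate covered by only one evaluates to 2.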
APPENDIX B: MCND ALGORITHM PROGRAM LISTINGS


PARDEF2.H

#define MSG_TYPE1 1
#define MSG_TYPE2 2
#define HOST_PID  10
#define NODE_PID  0
#define NVAR      2
#define NTERM     10

typedef short msg_coord;

typedef struct {
    short lower,
          upper;
} msg_bound;

typedef struct {
    msg_bound B[NVAR];
    short coeff,
          rbc;
} msg_implicant;

typedef struct {
    msg_implicant I[NTERM];
    short radix,
          nvar,
          nterm;
    int
          i_flag,
          m_flag;
    char of_file[MAX_PATH+1];
} msg_expression;

typedef struct {
    float ratio;
    int num_impl,
        node_no;
    long secs,
         msecs;
} msg_from_node;
MCND Algorithm Host Program Listings


#include "defs.h"
#include <cube.h>
#include "pardef2.h"

/* Multi-branch Concurrent Algorithm (Host) by Oral & Yang */

OPT_ND()
/*----------------------------------------------------------------
:function:
    - Perform the MCND Algorithm on the input expression
----------------------------------------------------------------*/
{
    register i,j;
    int num_impl = 0;
    float ratio;
    msg_expression
        msg_to_node;
    msg_from_node
        msg_from_node_first;

    if (E_final[O_N].I != NULL)
        dealloc_expr(&E_final[O_N]);

#   ifdef ANALYZER
    STAT = &ON_stat;
#   endif

    HEUR = O_N;
    E_final[HEUR].nterm = 0;
    E_final[HEUR].radix = E_orig.radix;
    E_final[HEUR].nvar = E_orig.nvar;
    E_final[HEUR].I = NULL;

    /* load the node program on all nodes once */
    if (!load_flag) {
        setpid(HOST_PID);
        load("/usr/oral/onurpar2/mvlcpar/opt_nd_n",-1,0);
        load_flag = 1;
    }

#   ifdef ANALYZER
    if (e_flag)
        print_terms(&E_orig);
#   endif

    /* copy the original expression into the message buffer */
    msg_to_node.nvar = E_orig.nvar;
    msg_to_node.nterm = E_orig.nterm;
    msg_to_node.radix = E_orig.radix;
    msg_to_node.i_flag = i_flag;
    msg_to_node.m_flag = m_flag;
    strcpy(msg_to_node.of_file,of_file);
    for (i=0; i < E_orig.nterm; i++) {
        msg_to_node.I[i].coeff = E_orig.I[i].coeff;
        msg_to_node.I[i].rbc = E_orig.I[i].rbc;
        for (j=0; j < E_orig.nvar; j++) {
            msg_to_node.I[i].B[j].upper = E_orig.I[i].B[j].upper;
            msg_to_node.I[i].B[j].lower = E_orig.I[i].B[j].lower;
        }
    }

    /* broadcast the expression, then collect one result per node */
    csend(MSG_TYPE1,&msg_to_node,sizeof(msg_to_node),-1,0);
    for (i=0; i < numnodes(); i++) {
        crecv(MSG_TYPE2,&msg_from_node_first,sizeof(msg_from_node_first));
        printf("%-4d OPT_PAR: %4d/%-4d %4.2f %6d:%3.3ld From node: %d\n",
               expr_seq,msg_from_node_first.num_impl,E_orig.nterm,
               msg_from_node_first.ratio,msg_from_node_first.secs,
               msg_from_node_first.msecs,msg_from_node_first.node_no);
    }
    dealloc_expr(&E_work);
}
MCND Algorithm Node Program Listings

#include "defs.h"
#include "pardef2.h"
#include <cube.h>
#include <fcntl.h>

/* Global data structures */

/* Logic expressions:
    E_orig
        - holds the original input expression as parsed
    E_work
        - a copy of E_orig
        - implicants are subtracted from this expression as terms
          during the course of optimization
    E_final[]
        - the result expression (starts out empty)
        - each term is one implicant found during optimization
        - each heuristic has its own E_final (for comparison)
*/

Expression
    E_orig = { 0,0,0,NULL },
    E_work = { 0,0,0,NULL },
    E_final[5] = {
        { 0,0,0,NULL },
        { 0,0,0,NULL },
        { 0,0,0,NULL },
        { 0,0,0,NULL },
        { 0,0,0,NULL }
    };

int HEUR;       /* Current heuristic
                 * HEUR indexes into E_final[]
                 * depending upon the currently
                 * active heuristic
                 */

int FINAL;      /* Index of the selected final
                 * expression
                 */

long mygroup_start,
     mygroup_end,
     mygroup_size;
int fd;
char msg[100];

/* Multi-branch Concurrent ND algorithm for a node by Oral & Yang

function:
    - Performs the MCND algorithm on a node
algorithm:
    Receive original expression set from host
    Start with working copy E_work of the original function E_orig
    Initialize a final function E_final
    While (there are still minterms to pick) {
        Pick a minterm X from E_work
        Pick the best implicant I for X
        Subtract I from E_work
        Add I to E_final
    }
*/

main()
{
    register i,j;
    int num_impl,
        better_found,
        expr_seq = 0;
    static char cfs[4] = "###";
    int *X;
    Implicant *I;
    double ratio;
    unsigned long T1,T2,
                  time;
    msg_expression
        msg_to_node;
    msg_from_node
        msg_from_node_first;

    for (;;) {
        expr_seq++;
        num_impl = 0;
        mygroup_start = 0;
        mygroup_size = numnodes();
        mygroup_end = mygroup_size - 1;

        crecv(MSG_TYPE1,&msg_to_node,sizeof(msg_to_node));

        if ((msg_to_node.i_flag | msg_to_node.m_flag)) {
            strcat(msg_to_node.of_file,cfs);
        }
        fd = open(msg_to_node.of_file,O_CREAT | O_RDWR | O_APPEND, 0644);

        dup_expr(&E_orig,&msg_to_node);

        if (E_final[O_N].I != NULL)
            dealloc_expr(&E_final[O_N]);

        HEUR = O_N;
        dup_expr(&E_work,&msg_to_node);

        E_final[HEUR].nterm = 0;
        E_final[HEUR].radix = E_orig.radix;
        E_final[HEUR].nvar = E_orig.nvar;
        E_final[HEUR].I = NULL;
        if (msg_to_node.m_flag) {
            sprintf(msg," Orig map (OPT_ND):\n");
            cwrite(fd,msg,strlen(msg));
            print_map();
        }

        better_found = 0;
        T1 = mclock();
        for (;;) {
            if ((X = mim(&E_work)) == NULL) {
                if (num_impl < E_orig.nterm)
                    better_found = 1;
                break;
            }
            I = pick_implicant(X);
            num_impl++;
            subtract_implicant(I);
            if (msg_to_node.i_flag)
                print_implicant(X,I);
            if (msg_to_node.m_flag)
                print_map();
        }
        T2 = mclock();
        time = T2 - T1;

        if (!better_found) {
            num_impl = E_orig.nterm;
            dup_expr(&(E_final[O_N]),&E_orig);
        }
        ratio = ((double)num_impl/(double)E_orig.nterm);
        if (ratio == 1) ratio = 0;

        /* report the result and the elapsed time to the host */
        msg_from_node_first.ratio = ratio;
        msg_from_node_first.node_no = mynode();
        msg_from_node_first.num_impl = num_impl;
        msg_from_node_first.secs = time / 1000;
        msg_from_node_first.msecs = time - (msg_from_node_first.secs * 1000);
        csend(MSG_TYPE2,&msg_from_node_first,sizeof(msg_from_node_first),myhost(),
              HOST_PID);

        if (msg_to_node.i_flag | msg_to_node.m_flag) {
            sprintf(msg,"%-4d OPT_PAR: %4d/%-4d %4.2f %6d:%3.3ld From node: %d\n",
                    expr_seq,num_impl,E_orig.nterm,ratio,msg_from_node_first.secs,
                    msg_from_node_first.msecs,mynode());
            cwrite(fd,msg,strlen(msg));
        }
        dealloc_expr(&E_work);
        close(fd);
    }
}


static  int  *mim(E) 

Expression  *E; 

r - 

:function: 

-  Find  the  Most  Isolated  Minterm  in  the  expression  pointed  to 
by  E,  and  return  its  coordinates  as  a  vector. 

-  Lxx:al  to  opt  nd  n.c 
'.globals: 
radix 
nvar 
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:side_effects: 

STAT 

:called_by: 

main() 

:calls: 

next_coord() 

eval() 

vcopyO 

:returns: 

-  A  vector  of  integers  representing  the  coordinate  of  the  most 
isolated  mintenn,  or  NULL  if  no  more  minterms. 

-  The  value  at  that  location  is  also  returned  as  the  last  integer 
in  the  vector. 

-  if  there  is  a  tie  (more  than  one  smallest  CF  value)  it  returns 
first  and  last,  and  divides  the  nodes  into  two  groups. 


register  i,j,k; 

int  cur_val  =  E->  radix, 
cur_val2  =  E->  radix, 
cur_CF  =  MAX_INT, 
cur  CF2  =  MAX  INT, 
X_orig[MAX_VAR+2], 

R_1  =  E_orig.radix  -  1, 

Not_all  =  0, 

All  trun  =  0, 

TRUN  =  2*R_1, 

last  =  0, 

expanded, 

vaiue[2], 

vall[2], 

val2[2], 

cf, 

ea, 

dea, 

term; 

int  *X,*next_coord(); 
static  int 

coord[MAX_VAR + 2], 
save_coord  1  [MAX_VAR + 2], 
save_coord2[MAX_VAR +2]; 

for  (term=0;  term  <  E  orig.nterm;  term  +  +)  { 
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k=  1; 

while  ((X=next_coord(coord,&(E->I[term]),k))  !=  NULL)  { 
vcopy(value,eval(E,X)); 

if  (value[EVAL]  &&  value[EVAL]  <  E  orig.radix)  { 
if  (!value[HLV]) 

Not_all  =  1; 
if  (All_trun)  { 
cf  =  0; 
dea  =  0; 
ea  =  0; 

for  (j=0;  j  <  E_orig.nvar;  j++)  X_orig[j]  =  X[j]; 

/*  for  each  variable  (direction)...  */ 
for  (j=0;  j  <  E_orig.nvar;  j++  )  { 
expanded  =  0; 

/*  If  not  on  a  left  hand  edge,  move  left  */ 
while  (X[j]  >  0)  { 

XD]-; 

vcopy(vall,eval(E,X)); 
if  (vall[EVAL])  { 
expanded  =  1; 
ea+  +  ; 

} 

else  break; 

} 

m  =  X.orig[jl; 
if  (expanded)  { 

expanded  =  0; 
dea++; 

} 

/*  if  we  didn’t  start  on  a  right  hand  edge,  move  right  */ 
while  (X[j]  <  R_l)  { 

X[j]++; 

vcopy(val2,eval(E,X)); 
if(val2[EVAL])  { 
expanded  =  1; 
ea4-  +  ; 

} 

else  break; 

} 

X[j]  =  X_orig[il; 
if  (expanded) 

dea++; 

> 
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r  compute  the  clustering  factor  */ 


cf  =  (dea  *  R_l)  +  ea; 
if  (cf  <  cur_CF)  { 

curval  =  value[EVAL]; 
cur_CF  =  cf; 

for  (i=0;  i  <  E_orig.nvar;  i++)  save_coordl[i]  =  X[i]; 

else  if  (cf  =  =  cur  CF)  { 
cur_val2  =  value[EVAL]; 
cur_CF2  =  cf; 

for  (i=0;  i  <  E_orig.nvar;  i++)  save_coord2[i]  =  X[i]; 

} 

                else {
                    cf = 0;
                    dea = 0;
                    ea = 0;

                    for (j=0; j < E_orig.nvar; j++) X_orig[j] = X[j];

                    /* for each variable (direction)... */
                    for (j=0; j < E_orig.nvar; j++) {
                        expanded = 0;

                        /* If not on a left hand edge, move left */
                        while (X[j] > 0) {
                            X[j]--;
                            vcopy(val1,eval(E,X));
                            if (val1[EVAL] && (val1[EVAL] <= value[EVAL]
                                    || value[HLV])) {
                                expanded = 1;
                                ea++;
                            }
                            else
                                break;
                        }

                        X[j] = X_orig[j];
                        if (expanded) {
                            expanded = 0;
                            dea++;
                        }

                        /* if we didn't start on a right hand edge, move right */
                        while (X[j] < R_1) {
                            X[j]++;
                            vcopy(val2,eval(E,X));
                            if (val2[EVAL] && (val2[EVAL] <= value[EVAL]
                                    || value[HLV])) {
                                expanded = 1;
                                ea++;
                            }
                            else
                                break;
                        }

                        X[j] = X_orig[j];
                        if (expanded)
                            dea++;
                    }

                    /* compute the clustering factor */
                    cf = (dea * R_1) + ea;
                    if (!(value[HLV] && cf > TRUN)) {
                        if (cf < cur_CF) {
                            cur_val = value[EVAL];
                            cur_CF = cf;
                            for (i=0; i < E_orig.nvar; i++) save_coord1[i] = X[i];
                        }
                        else if (cf == cur_CF) {
                            cur_val2 = value[EVAL];
                            cur_CF2 = cf;
                            for (i=0; i < E_orig.nvar; i++) save_coord2[i] = X[i];
                        }
                    }
                }
            }
            k = 0;
        }

        if (!last && (term == (E_orig.nterm - 1)) && !Not_all) {
            All_trun = 1;
            cur_CF = MAX_INT;
            term = -1;
            last = 1;
        }
    }

    if (cur_CF == MAX_INT)
        return(NULL);

    save_coord1[E_orig.nvar+1] = cur_CF;
    save_coord1[E_orig.nvar] = cur_val;
    save_coord2[E_orig.nvar+1] = cur_CF;
    save_coord2[E_orig.nvar] = cur_val2;

    if (cur_CF != cur_CF2) return(save_coord1);

    /* tie: the upper half of the node group follows the second candidate */
    else if (mynode() >= (mygroup_start + mygroup_size/2)) {
        mygroup_start = mygroup_start + mygroup_size/2;
        mygroup_size = mygroup_size/2;
        return(save_coord2);
    }
    else {
        mygroup_end = mygroup_start + (mygroup_size/2 - 1);
        mygroup_size = mygroup_size/2;
        return(save_coord1);
    }
}
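
The clustering factor computed above, cf = (dea * R_1) + ea, can be tried in isolation. The following is a minimal ANSI C sketch, not the thesis code: it mirrors only the All_trun branch (a neighbor counts whenever its value is nonzero), and the flat map layout, the helper cell(), and the name clustering_factor() are assumptions made for illustration.

```c
#include <assert.h>

#define CF_MAX_VAR 8   /* hypothetical limit on the number of variables */

/* Hypothetical flat map: the value at x[0..nvar-1] is stored at
   map[x[0] + radix*x[1] + radix*radix*x[2] + ...]. */
static int cell(const int *map, int nvar, int radix, const int *x)
{
    int idx = 0, stride = 1, j;

    for (j = 0; j < nvar; j++) {
        idx += x[j] * stride;
        stride *= radix;
    }
    return map[idx];
}

/* cf = dea * (radix - 1) + ea, where ea counts the contiguous nonzero
   cells reachable along each axis and dea counts the directions in
   which at least one step was possible. */
int clustering_factor(const int *map, int nvar, int radix, const int *x0)
{
    int x[CF_MAX_VAR];
    int ea = 0, dea = 0, j, k, step;

    assert(nvar > 0 && nvar <= CF_MAX_VAR);
    for (j = 0; j < nvar; j++)
        x[j] = x0[j];

    for (j = 0; j < nvar; j++) {
        for (step = -1; step <= 1; step += 2) {   /* left, then right */
            int expanded = 0;

            for (k = x0[j] + step; k >= 0 && k < radix; k += step) {
                x[j] = k;
                if (!cell(map, nvar, radix, x))
                    break;
                expanded = 1;
                ea++;
            }
            x[j] = x0[j];                         /* restore the coordinate */
            dea += expanded;
        }
    }
    return dea * (radix - 1) + ea;
}
```

A minterm with no nonzero neighbors gets cf = 0, which is why the smallest clustering factor identifies the most isolated minterm.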


static int valid_implicant(I)
Implicant *I;
/*----------------------------------------------------------------------
:function:
    - Decide upon the validity of implicant I
    - Local to opt_nd_n.c
:globals:
    E_work
    E_orig
:side effects:
    ~STAT
:called_by:
    pick_implicant()
:calls:
    next_coord()
    eval()
    vcopy()
:returns:
    1 if a valid implicant
    0 if not
----------------------------------------------------------------------*/
{
    int *X;
    int init = 1;
    int R_1 = E_orig.radix - 1;
    int value = I->coeff;
    int Vo[2],Vw[2];
    static int coord[MAX_VAR+2];

    while ((X = next_coord(coord,I,init)) != NULL) {
        init = 0;
        vcopy(Vw,eval(&E_work,X));
        vcopy(Vo,eval(&E_orig,X));
        if (((Vw[EVAL] < value) && !Vw[HLV]) && (Vo[EVAL] < R_1))
            return(0);
    }
    return(1);
}

static int compute_rbc(I)
Implicant *I;
/*----------------------------------------------------------------------
:function:
    - Compute the RBC for the given implicant
    - Local to opt_nd_n.c
:globals:
    radix
    nvar
:side effects:
    ~STAT
:called_by:
    pick_implicant()
:calls:
    next_coord()
    eval()
    vcopy()
:returns:
    - an integer RBC
----------------------------------------------------------------------*/
{
    int *X;
    int I_value = I->coeff;
    register i;
    int value[2],
        R_1 = E_orig.radix - 1,
        neighbor_value[2],
        good,
        bad,
        diff,
        equal,
        neig_boun,
        first,
        rbc = 0,
        init = 1;
    static int coord[MAX_VAR+2];

    /* for each coordinate in the implicant ... */
    while ((X = next_coord(coord,I,init)) != NULL) {
        init = 0;
        equal = 0;

        vcopy(value,eval(&E_work,X));
        if (value[EVAL] == E_orig.radix)
            continue;

        diff = value[EVAL] - I_value;
        first = 1;

        /* for each direction ... */
        for (i=0; i < E_orig.nvar; i++) {
            good = 0;
            bad = 0;

            if ((diff <= 0) && first) {
                good = 2;
                first = 0;
            }

            /* if there is a left neighbor, examine it */
            if (X[i] != 0 && X[i] == I->B[i].lower) {
                X[i]--;
                vcopy(neighbor_value,eval(&E_work,X));
                neig_boun = neighbor_value[EVAL] - value[EVAL];
                X[i]++;

                if (neighbor_value[EVAL] != 0) {
                    if (!neighbor_value[HLV] || !value[HLV]) {
                        if (neighbor_value[EVAL] < diff) {
                            if (neighbor_value[HLV])
                                good += 1;
                            else
                                bad += 2;
                        }
                        if (neighbor_value[EVAL] > diff) {
                            if (!neig_boun)
                                bad += 2;
                            if (neighbor_value[HLV] && neig_boun < 0)
                                bad += 2;
                            if (diff > 0 && neig_boun) {
                                if (value[HLV])
                                    good += 1;
                                else
                                    bad += 2;
                            }
                        }
                        else {
                            if (neighbor_value[HLV] || value[HLV])
                                good += 1;
                            else
                                good += 2;
                        }
                    }
                }
            }

            /* if there is a right neighbor, examine it */
            if (X[i] != R_1 && X[i] == I->B[i].upper) {
                X[i]++;
                vcopy(neighbor_value,eval(&E_work,X));
                neig_boun = neighbor_value[EVAL] - value[EVAL];
                X[i]--;

                if (neighbor_value[EVAL] != 0) {
                    if (!neighbor_value[HLV] || !value[HLV]) {
                        if (neighbor_value[EVAL] < diff) {
                            if (neighbor_value[HLV])
                                good += 1;
                            else
                                bad += 2;
                        }
                        if (neighbor_value[EVAL] > diff) {
                            if (!neig_boun)
                                bad += 2;
                            if (neighbor_value[HLV] && neig_boun < 0)
                                bad += 2;
                            if (diff > 0 && neig_boun) {
                                if (value[HLV])
                                    good += 1;
                                else
                                    bad += 2;
                            }
                        }
                        else {
                            if (neighbor_value[HLV] || value[HLV])
                                good += 1;
                            else
                                good += 2;
                        }
                    }
                }
            }

            /* update the rbc */
            rbc = (rbc - good) + bad;
        }
    }

    return(rbc);
}


static Implicant *pick_implicant(X)
int *X;
/*----------------------------------------------------------------------
:function:
    - Pick the best implicant for minterm X
:globals:
    radix
:side effects:
    ~STAT
:called_by:
    Wang_Yang()
:calls:
    init_implicant()
    gen_bounds()
    next_implicant()
    eval()
    vcopy()
    compute_rbc()
    copy_implicant()
    valid_implicant()
:returns:
    - A pointer to a term representing the best implicant.
----------------------------------------------------------------------*/
{
    int cur_rbc = MAX_INT,
        rbc = 0,
        I_value,
        i,
        init = 1,
        first = 1;
    Implicant *I;
    static int coord[MAX_VAR+2];
    static Bound I_bound[MAX_VAR+2];
    static Implicant I_best;
    Bound *B;
    int V[2],
        value[2];

    I_best.B = I_bound;
    init_implicant(X);

    B = gen_bounds(X);
    vcopy(V,eval(&E_orig,X));
    while ((I = next_implicant(B)) != NULL) {
        if (V[HLV]) {
            for (I->coeff=X[E_orig.nvar]; I->coeff < E_orig.radix; (I->coeff)++) {
                if (valid_implicant(I)) {
                    rbc = compute_rbc(I);
                    if (first)
                        rbc = 2;
                    else
                        rbc += 2;
                    if (rbc <= cur_rbc) {
                        cur_rbc = rbc;
                        I->rbc = rbc;
                        copy_implicant(&I_best,I);
                    }
                }
            }
            first = 0;
        }

        else {
            I->coeff = X[E_orig.nvar];
            if (valid_implicant(I)) {
                rbc = compute_rbc(I);
                if (first) {
                    first = 0;
                    if (rbc < 0)
                        rbc = 1;
                    else
                        rbc += 2;
                }
                else
                    rbc += 2;
                if (rbc <= cur_rbc) {
                    cur_rbc = rbc;
                    I->rbc = rbc;
                    copy_implicant(&I_best,I);
                }
            }
        }
    }

    return(&I_best);
}


int *eval(E,X)
Expression *E;
int *X;
/*----------------------------------------------------------------------
:function:
    - Evaluate the expression at X, where X is a vector of coordinates
:globals:
    nvar
    radix
:side effects:
    ~STAT
:called_by:
    mim() - pa.c
    valid_implicant() - pa.c
    pick_implicant() - pa.c
    mim() - dm.c
    valid_implicant() - dm.c
    pick_implicant() - dm.c
    compute_rbc()
    gen_bounds()
    print_map()
:returns:
    - A vector with the value of the expression at the specified
      coordinate as its first element, and a flag set if this value
      has attained the highest logic value (HLV)
----------------------------------------------------------------------*/
{
    int nterm = E->nterm;
    register i,j;
    int out_of_bounds;
    static int V[2];
    register rm1 = E_orig.radix-1;

    V[EVAL] = 0;
    V[HLV] = 0;

    /* for each term ... */
    for (i=0; i < nterm; i++) {

        /* for each variable ... */
        for (j=0,out_of_bounds=0; j < E_orig.nvar; j++) {
            if (
                (X[j] < E->I[i].B[j].lower) ||
                (X[j] > E->I[i].B[j].upper)
            ) {
                out_of_bounds = 1;
                break;
            }
        }

        if (out_of_bounds)
            continue;

        /* if this is a don't care, return the radix */
        if (E->I[i].coeff == E_orig.radix) {
            V[EVAL] = E_orig.radix;
            return(V);
        }

        V[EVAL] += E->I[i].coeff;
        if (V[EVAL] >= rm1) {
            /* set a flag which means E_orig was saturated at this X */
            V[HLV] = 1;
        }

        if (V[EVAL] > rm1) {
            V[EVAL] = rm1;
        }
        else if (V[HLV] && (V[EVAL] <= 0)) {
            V[EVAL] = E_orig.radix;
            return(V);
        }
    }

    return(V);
}
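
The truncated-sum evaluation performed by eval() can be exercised on its own. The following ANSI C sketch mirrors the logic above under simplifying assumptions: the variable count is fixed at two, the HLV flag is returned through a pointer rather than a static vector, and the names Term, SK_NVAR and eval_tsum() are hypothetical.

```c
#include <assert.h>

#define SK_NVAR 2   /* hypothetical fixed variable count for this sketch */

typedef struct { int lower, upper; } SkBound;
typedef struct { int coeff; SkBound B[SK_NVAR]; } Term;

/* Adds the coefficients of every term whose bounds contain x, truncating
   the sum at radix - 1.  *hlv is set once the sum reaches radix - 1.
   A don't-care term (coeff == radix) makes the result radix. */
int eval_tsum(const Term *terms, int nterm, int radix, const int *x, int *hlv)
{
    int sum = 0, i, j;

    assert(nterm >= 0 && radix > 1);
    *hlv = 0;
    for (i = 0; i < nterm; i++) {
        int inside = 1;

        for (j = 0; j < SK_NVAR; j++) {
            if (x[j] < terms[i].B[j].lower || x[j] > terms[i].B[j].upper) {
                inside = 0;
                break;
            }
        }
        if (!inside)
            continue;

        if (terms[i].coeff == radix)
            return radix;                 /* don't care */

        sum += terms[i].coeff;
        if (sum >= radix - 1)
            *hlv = 1;                     /* saturated at this coordinate */
        if (sum > radix - 1)
            sum = radix - 1;
        else if (*hlv && sum <= 0)
            return radix;                 /* fully covered: treat as don't care */
    }
    return sum;
}
```

Because negative coefficients are allowed, the same routine serves the working expression, in which covered implicants are subtracted.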

int *next_coord(coord,I,first)
int *coord;
Implicant *I;
int first;
/*----------------------------------------------------------------------
:function:
    - Compute the next possible coordinate for term *I
    - If first == 1, initialize the coord vector
:called_by:
    mim()
    valid_implicant()
    compute_rbc()
:returns:
    - An integer vector containing the coordinates.
----------------------------------------------------------------------*/
{
    static i;

    /* if the first time through, load the vector */
    if (first) {
        for (i=0; i < E_orig.nvar; i++) {
            coord[i] = I->B[i].lower;
        }
    }
    else {
        i = 0;
        coord[i]++;
        for (;;) {
            if (coord[i] > I->B[i].upper) {
                coord[i] = I->B[i].lower;
                i++;
                if (i >= E_orig.nvar)
                    return(NULL);
                coord[i]++;
            }
            else {
                break;
            }
        }
    }

    return(coord);
}
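
next_coord() walks the cells of an implicant like an odometer: the first variable advances fastest, and a variable that passes its upper bound wraps to its lower bound and carries into the next one. A minimal ANSI C sketch of the same idea follows; the variable count is passed explicitly instead of being read from E_orig, and the names OdBound, next_coord_sketch() and count_coords() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int lower, upper; } OdBound;

/* Returns coord advanced to the next cell inside the bounds b[0..nvar-1],
   or NULL once every cell has been visited. */
int *next_coord_sketch(int *coord, const OdBound *b, int nvar, int first)
{
    int i;

    if (first) {
        for (i = 0; i < nvar; i++)
            coord[i] = b[i].lower;
        return coord;
    }
    for (i = 0; i < nvar; i++) {
        if (++coord[i] <= b[i].upper)
            return coord;
        coord[i] = b[i].lower;   /* wrap this digit and carry into the next */
    }
    return NULL;
}

/* Counts the visited cells; this equals the product of the bound widths. */
int count_coords(const OdBound *b, int nvar)
{
    int coord[8], n = 0, first = 1;

    assert(nvar > 0 && nvar <= 8);
    while (next_coord_sketch(coord, b, nvar, first) != NULL) {
        first = 0;
        n++;
    }
    return n;
}
```

For bounds X1 in [1,3] and X2 in [0,1] the walk visits 3 * 2 = 6 cells, which is the number of eval() calls valid_implicant() makes for such an implicant in the worst case.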


Bound *gen_bounds(X)
int *X;
/*----------------------------------------------------------------------
:function:
    - Generate the permissible bounds around location X in the
      working expression
:globals:
    radix
    nvar
    E_work
    E_orig
:side_effects:
    STAT
:called_by:
    pick_implicant()
:calls:
    eval()
    vcopy()
:returns:
    - A bounds array
----------------------------------------------------------------------*/
{
    static Bound B[MAX_VAR+2];
    int nterm = E_work.nterm;
    register i,j,k;
    int value,Vw[2],Vo[2];
    int Xp[MAX_VAR+2];

    value = X[E_orig.nvar];

    /* for each variable (direction)... */
    for (i=0; i < E_orig.nvar; i++) {

        /* dup the coordinate */
        for (j=0; j < E_orig.nvar; j++) Xp[j] = X[j];
        B[i].lower = X[i];

        /* while not on a left hand edge, move left */
        while (Xp[i] > 0) {
            Xp[i]--;
            vcopy(Vw,eval(&E_work,Xp));
            vcopy(Vo,eval(&E_orig,Xp));

            /* if can't expand to left .... */
            if (!((value > Vw[EVAL]) && (Vo[EVAL] < (E_orig.radix-1)))) {
                B[i].lower = Xp[i];
            }
            else
                break;
        }

        /* dup the coordinate */
        for (j=0; j <= (E_orig.nvar+1); j++) Xp[j] = X[j];
        B[i].upper = X[i];

        /* while not on a right hand edge, move right */
        while (Xp[i] < (E_orig.radix-1)) {
            Xp[i]++;
            vcopy(Vw,eval(&E_work,Xp));
            vcopy(Vo,eval(&E_orig,Xp));

            /* if can't expand to right ... */
            if (!((value > Vw[EVAL]) && (Vo[EVAL] < (E_orig.radix-1)))) {
                B[i].upper = Xp[i];
            }
            else
                break;
        }
    }

    return (B);
}


/* Working structures for picking the next implicant within bounds */

static Bound IB[MAX_VAR+2];     /* Current bounds */
static Implicant I;             /* Implicant */
static int
    I_var,
    I_first,
    I_val;
int X_orig[MAX_VAR+2];          /* Where we start */

init_implicant(X)
int *X;
/*----------------------------------------------------------------------
:function:
    - Initialize the static term structure above from which successive
      implicants will be returned
    - X is the starting minterm
:side_effects:
    - The structures above
:called_by:
    pick_implicant()
----------------------------------------------------------------------*/
{
    int nterm = E_work.nterm;
    register i;

    /* initialize the implicant */
    I.B = IB;
    I.coeff = X[E_orig.nvar];
    I.rbc = X[E_orig.nvar+1];
    for (i=0; i < E_orig.nvar; i++) {
        I.B[i].upper = X[i];
        I.B[i].lower = X[i];
    }

    I_var = 0;
    I_first = 1;
    I_val = X[E_orig.nvar];
    for (i=0; i <= (E_orig.nvar+1); i++) X_orig[i] = X[i];
}


Implicant *next_implicant(B)
Bound *B;
/*----------------------------------------------------------------------
:function:
    - On each call, return the next implicant within bounds B
:side effects:
    ~STAT
:called_by:
    pick_implicant()
:returns:
    - An implicant as a term structure
----------------------------------------------------------------------*/
{
    int nterm = E_work.nterm;
    int Xp[MAX_VAR+2];

    if (I_first) {
        I_first = 0;
        return(&I);
    }

    while (I_var < E_orig.nvar) {

        /* expand left */
        I.B[I_var].lower--;

        /* if we can't go further left, then ... */
        if (I.B[I_var].lower < B[I_var].lower) {

            /* move back and go right */
            I.B[I_var].lower = X_orig[I_var];
            I.B[I_var].upper++;

            /* if we can't go further right, then ... */
            if (I.B[I_var].upper > B[I_var].upper) {

                /* reset and go to the next higher dimension */
                I.B[I_var].upper = X_orig[I_var];
                I_var++;
                continue;
            }
        }

        I_var = 0;
        return(&I);
    }

    return(NULL);
}

int copy_implicant(dest,src)
Implicant *dest,*src;
/*----------------------------------------------------------------------
:function:
    - Copy the implicant pointed to by src to dest
:called_by:
    pick_implicant()
----------------------------------------------------------------------*/
{
    register i;

    dest->coeff = src->coeff;
    dest->rbc = src->rbc;
    for (i=0; i < E_orig.nvar; i++) {
        dest->B[i].lower = src->B[i].lower;
        dest->B[i].upper = src->B[i].upper;
    }
}


subtract_implicant(I)
Implicant *I;
/*----------------------------------------------------------------------
:function:
    - Add implicant I to the working expression as a negative term
      (negated coefficient)
    - Add implicant I to the final expression
:globals:
    HEUR
    nvar
:side_effects:
    E_work
    E_final[]
----------------------------------------------------------------------*/
{
    register i,term;

    term = E_work.nterm;
    E_work.nterm++;
    E_work.I = alloc_implicant(E_work.I,-(I->coeff),E_work.nterm);
    for (i=0; i < E_orig.nvar; i++) {
        E_work.I[term].B[i].lower = I->B[i].lower;
        E_work.I[term].B[i].upper = I->B[i].upper;
    }

    term = E_final[HEUR].nterm;
    E_final[HEUR].nterm++;
    E_final[HEUR].I =
        alloc_implicant(E_final[HEUR].I,I->coeff,E_final[HEUR].nterm);
    for (i=0; i < E_orig.nvar; i++) {
        E_final[HEUR].I[term].B[i].lower = I->B[i].lower;
        E_final[HEUR].I[term].B[i].upper = I->B[i].upper;
    }
}


/* vcopy()
    - copies the value vector from s to d
*/
vcopy(d,s)
int *d,*s;
{
    d[0] = s[0];
    d[1] = s[1];
}

/* memory allocation functions */

Implicant *alloc_implicant(p,coeff,n)
Implicant *p;
int coeff,n;
/*----------------------------------------------------------------------
:function:
    - Allocate space for a term array, initializing the last element
    - If p is NULL, allocate new space
    - If p is not, realloc
:returns:
    - A pointer to the Implicant
----------------------------------------------------------------------*/
{
    char *malloc(),*realloc();
    Bound *alloc_bound();

    if (p == NULL) {
        if ((p=(Implicant *)malloc(sizeof(Implicant)*n)) == NULL)
            fatal("alloc_implicant(): out of memory\n");
        p->coeff = coeff;
        p->B = alloc_bound();
    }
    else {
        if ((p=(Implicant *)realloc(p,sizeof(Implicant)*n)) == NULL)
            fatal("alloc_implicant(): out of memory\n");
        p[n-1].coeff = coeff;
        p[n-1].B = alloc_bound();
    }

    return(p);
}
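
The grow-by-one pattern used by alloc_implicant() can be written in ANSI C so that a failed realloc() does not lose the existing array. This is only a sketch under assumed types: GrBound, GrImplicant and grow_implicants() are hypothetical names, the function reports failure by returning NULL instead of calling fatal(), and the bounds are initialized to -1 and radix-1 as alloc_bound() does.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int lower, upper; } GrBound;
typedef struct { int coeff; int rbc; GrBound *B; } GrImplicant;

/* Grows the array p to n entries and initializes only the new last entry.
   On failure returns NULL and leaves the array p untouched. */
GrImplicant *grow_implicants(GrImplicant *p, int coeff, int n, int nvar, int radix)
{
    GrImplicant *q;
    GrBound *b;
    int i;

    assert(n > 0 && nvar > 0);

    b = malloc(sizeof *b * (size_t)nvar);
    if (b == NULL)
        return NULL;

    q = realloc(p, sizeof *q * (size_t)n);  /* realloc(NULL, ...) acts as malloc */
    if (q == NULL) {
        free(b);
        return NULL;
    }

    for (i = 0; i < nvar; i++) {
        b[i].lower = -1;
        b[i].upper = radix - 1;
    }
    q[n - 1].coeff = coeff;
    q[n - 1].rbc = 0;
    q[n - 1].B = b;
    return q;
}
```

Keeping the result of realloc() in a separate pointer is the design point: the original assigns it straight back to p, which is harmless there only because fatal() exits on failure.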

Bound *alloc_bound()
/*----------------------------------------------------------------------
:function:
    - Allocate space for E_orig.nvar bounds entries and initialize
      each bound to -1, E_orig.radix-1.
    - If p is NULL, allocate new space
:globals:
    E_orig
:returns:
    - A pointer to the Bound array
----------------------------------------------------------------------*/
{
    Bound *p;
    char *malloc();
    register i;

    if ((p=(Bound *)malloc(sizeof(Bound)*(E_orig.nvar))) == NULL)
        fatal("alloc_bound(): out of memory\n");

    for (i=0; i < E_orig.nvar; i++) {
        p[i].lower = -1;
        p[i].upper = E_orig.radix-1;
    }

    return(p);
}


init_expr()
/*----------------------------------------------------------------------
:function:
    - Initialize E_work, E_orig and E_final
:side_effects:
    E_work
    E_orig
    E_final
----------------------------------------------------------------------*/
{
    E_work.I = NULL;
    E_orig.I = NULL;
    E_orig.nvar = 0;
    E_orig.nterm = 0;
    E_orig.radix = 0;
    E_final[0].I = NULL;
    E_final[1].I = NULL;
}

dealloc_expr(e)
Expression *e;
/*----------------------------------------------------------------------
:function:
    - Deallocate the expression pointed to by e
----------------------------------------------------------------------*/
{
    Implicant *p;
    register i;

    if (e->I != NULL) {
        for (p = e->I,i=0; i < e->nterm; i++)
            if (p[i].B != NULL) {
                free(p[i].B);
                p[i].B = NULL;
            }
        free(p);
        e->I = NULL;
    }

    e->nvar = 0;
    e->nterm = 0;
    e->radix = 0;
}


dup_expr(E_dest,E_src)
Expression *E_dest;
msg_expression *E_src;
/*----------------------------------------------------------------------
:function:
    - Duplicate the expression pointed to by E_src by allocating as
      necessary and copying into the expression pointed to by E_dest.
    - If E_dest can contain E_src, no reallocation is performed (this
      test is made by comparing nvar and nterm parameters, and by testing
      pointers against NULL)
:calls:
    alloc_bound()
----------------------------------------------------------------------*/
{
    Implicant *I;
    Bound *B;
    register i,j;
    char *malloc();
    int
        nterm = E_dest->nterm,
        nvar = E_dest->nvar;

    if (nterm != E_src->nterm) {
        if (E_dest->I != NULL)
            dealloc_expr(E_dest);
    }

    E_dest->radix = E_src->radix;
    E_dest->nvar = E_src->nvar;
    E_dest->nterm = E_src->nterm;

    if (E_dest->I == NULL) {
        if ((I=(Implicant *)malloc(sizeof(Implicant)*(E_dest->nterm)))
                == NULL)
            fatal("dup_expr(): out of memory\n");
        for (i=0; i < E_src->nterm; i++)
            I[i].B = NULL;
        E_dest->I = I;
    }
    else
        I = E_dest->I;

    for (i=0; i < E_src->nterm; i++) {
        I[i].coeff = E_src->I[i].coeff;
        if ((E_orig.nvar != E_src->nvar) || (I[i].B == NULL)) {
            I[i].B = alloc_bound();
        }
        for (j=0; j < E_src->nvar; j++) {
            I[i].B[j].lower = E_src->I[i].B[j].lower;
            I[i].B[j].upper = E_src->I[i].B[j].upper;
        }
    }
}




static struct tms
    T1,T2,T1a,T2a;

resource_used(op)
{
    static call = 0;

    if (op == START)
        times(call==0?&T1:&T1a);
    else
        times(call==1?&T2:&T2a);
    if (++call > 1)
        call = 0;
}

#ifndef HZ
#define HZ 60
#endif


long secs_used()
{
    return((T2.tms_utime-T1.tms_utime)/HZ);
}

long tsecs_used()
{
    return((((T2.tms_utime-T1.tms_utime) %HZ) * 1000L) /HZ);
}
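
secs_used() and tsecs_used() split an elapsed tick count into whole seconds and the leftover thousandths of a second. The same arithmetic as a stand-alone ANSI C sketch, where the function name ticks_to_time() and the explicit hz parameter (in place of the HZ macro) are assumptions made for illustration:

```c
#include <assert.h>

/* Splits a tick count into whole seconds and leftover milliseconds for a
   clock that runs at hz ticks per second. */
void ticks_to_time(long ticks, long hz, long *secs, long *msecs)
{
    assert(hz > 0 && ticks >= 0);
    *secs = ticks / hz;
    *msecs = ((ticks % hz) * 1000L) / hz;
}
```

With hz = 60, 90 ticks come out as 1 second and 500 milliseconds. The resolution is limited to 1/hz of a second, so at 60 ticks per second no reported time is finer than about 17 milliseconds.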

fatal(s)
char *s;
{
    fprintf(stderr,"%s\n",s);
    exit(1);
}

print_map()
{
    register i,j;
    int X[MAX_VAR+2];
    int *V;

    for (i=0; i < E_orig.nvar; i++) X[i] = 0;
    for (i=0; i < E_orig.nvar;) {
        V = eval(&E_work,X);
        sprintf(msg,"%s%3d%c",X[i]==0?" ":"",V[EVAL],V[HLV]?'.':' ');
        cwrite(fd,msg,strlen(msg));

        X[i]++;
        for (;i < E_orig.nvar;) {
            if (X[i] >= E_orig.radix) {
                X[i] = 0;
                if (i < 2)
                    sprintf(msg,"\n");
                cwrite(fd,msg,strlen(msg));
                i++;
                X[i]++;
            }
            else {
                i = 0;
                break;
            }
        }
    }
}

print_implicant(X,I)
int *X;
Implicant *I;
/*----------------------------------------------------------------------
:function:
    - Print the Most Isolated Minterm X and the implicant selected
      to cover it I.
:called_by:
    main()
----------------------------------------------------------------------*/
{
    register i;

    if (X != NULL) {
        sprintf(msg," MIM: (%d) %2d",X[E_orig.nvar+1],X[E_orig.nvar]);
        cwrite(fd,msg,strlen(msg));
        for (i=0; i < E_orig.nvar; i++) {
            sprintf(msg,"*X%d(%2d)",i+1,X[i]);
            cwrite(fd,msg,strlen(msg));
        }
        sprintf(msg,"\n");
        cwrite(fd,msg,strlen(msg));
    }

    sprintf(msg," Imp: (%d) %2d",I->rbc,I->coeff);
    cwrite(fd,msg,strlen(msg));
    for (i=0; i < E_orig.nvar; i++) {
        sprintf(msg,"*X%d(%2d,%2d)",i+1,I->B[i].lower,I->B[i].upper);
        cwrite(fd,msg,strlen(msg));
    }
    sprintf(msg,"\n\n");
    cwrite(fd,msg,strlen(msg));
}

APPENDIX C: TIME COMPARISON TABLES


TABLE C.1: TWO VARIABLE FOUR VALUED TIME COMPARISON

Number of      Computation Time     Computation Time     Ratio
Input Terms    for Sequential       for Parallel
               Algorithm (secs.)    Algorithm (secs.)

     5               0.3293               0.2510         1.3120
    10               1.0167               0.6420         1.5836
    15               1.6253               0.9933         1.6363
    20               2.0710               1.1357         1.8235
    25               3.0437               1.6083         1.8925
    30               3.4947               1.3793         2.5337
    35               3.4890               1.2433         2.8062
    40               4.6900               1.7583         2.6673
    45               5.7493               2.0900         2.7509
    50               6.7473               2.4200         2.7881

TABLE C.2: THREE VARIABLE FOUR VALUED TIME COMPARISON

Number of      Computation Time     Computation Time     Ratio
Input Terms    for Sequential       for Parallel
               Algorithm (secs.)    Algorithm (secs.)

     5               1.5587               1.0906         1.4222
    10               7.1123               4.5500         1.5631
    15              14.2383               9.2657         1.5367
    20              22.6060              14.4527         1.5641
    25              34.0607              20.3773         1.6715
    30              40.5067              24.3210         1.6655
    35              50.4833              30.2620         1.6682
    40              61.6193              37.3053         1.6518
    45              69.1513              41.3650         1.6717
    50              73.1720              42.9803         1.7025
    55              75.3140              43.4327         1.7340
    60              76.6263              42.5230         1.8020
    65              78.4003              41.2967         1.8985
    70              79.2020              40.8500         1.9388

TABLE C.3: FOUR VARIABLE FOUR VALUED TIME COMPARISON

Number of      Computation Time     Computation Time     Ratio
Input Terms    for Sequential       for Parallel
               Algorithm (secs.)    Algorithm (secs.)

     5               6.3707               4.8860         1.3039
    10              28.5067              17.9700         1.5863
    15              67.2783              40.4720         1.6623
    20             130.1080              72.9250         1.7841
    25             208.7533             114.6147         1.8213
    30             311.2900             183.1463         1.6997
    35             400.2017             234.6913         1.7052

TABLE C.4: FIVE VARIABLE FOUR VALUED TIME COMPARISON

Number of      Computation Time     Computation Time     Ratio
Input Terms    for Sequential       for Parallel
               Algorithm (secs.)    Algorithm (secs.)

     5                 145                  112           1.3
    10                 796                  518           1.5
    15                2111                 1207           1.7
    20                4298                 2257           1.9
    25                7876                 3998           2.0
    30               12048                 5857           2.1
    35               16406                 7447           2.2
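
The Ratio column in Tables C.1 through C.4 is the sequential time divided by the parallel time, i.e. the speedup. A small ANSI C sketch for recomputing it is given below; the function names are hypothetical, and the node count passed to efficiency() is a value the caller must supply, since the tables do not list it.

```c
#include <assert.h>

/* Speedup: sequential time divided by parallel time. */
double speedup(double t_seq, double t_par)
{
    assert(t_par > 0.0);
    return t_seq / t_par;
}

/* Parallel efficiency: speedup per node (node count supplied by the caller). */
double efficiency(double t_seq, double t_par, int nodes)
{
    assert(nodes > 0);
    return speedup(t_seq, t_par) / nodes;
}
```

For the last row of Table C.1, speedup(6.7473, 2.4200) reproduces the tabulated ratio of 2.7881 to four decimal places.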


APPENDIX D: SOLUTION SETS FOR EXAMPLE 6

SOLUTION FROM ND ALGORITHM

Orig map (W&Y):
  1   1   1   1
  3.  3.  3.  3.
  1   2   3.  2
  1   2   3.  2

 MIM: (4)  2*X1( 3)*X2( 2)
 Imp: (-9)  2*X1( 1, 3)*X2( 1, 3)

  1   1   1   1
  3.  1.  1.  1.
  1   0   1.  0
  1   0   1.  0

 MIM: (4)  1*X1( 0)*X2( 2)
 Imp: (-2)  1*X1( 0, 0)*X2( 0, 3)

  0   1   1   1
  2.  1.  1.  1.
  0   0   1.  0
  0   0   1.  0

 MIM: (6)  1*X1( 2)*X2( 3)
 Imp: (-2)  1*X1( 2, 2)*X2( 0, 3)

  0   1   0   1
  2.  1.  4.  1.
  0   0   4.  0
  0   0   4.  0

 MIM: (4)  1*X1( 1)*X2( 0)
 Imp: (-2)  1*X1( 1, 1)*X2( 0, 1)

  0   0   0   1
  2.  4.  4.  1.
  0   0   4.  0
  0   0   4.  0

 MIM: (4)  1*X1( 3)*X2( 0)
 Imp: (-2)  1*X1( 3, 3)*X2( 0, 1)

  0   0   0   0
  2.  4.  4.  4.
  0   0   4.  0
  0   0   4.  0

 MIM: (6)  2*X1( 0)*X2( 1)
 Imp: (0)  3*X1( 0, 3)*X2( 1, 1)

  0   0   0   0
  4.  4.  4.  4.
  0   0   4.  0
  0   0   4.  0

1 W&Y: 6/10 0.60 0:640

SOLUTION FROM MCND NODE #0 AND #1

Orig map(OPT_ND):
  1   1   1   1
  3.  3.  3.  3.
  1   2   3.  2
  1   2   3.  2

 MIM: (4)  2*X1( 3)*X2( 2)
 Imp: (-9)  2*X1( 1, 3)*X2( 1, 3)

  1   1   1   1
  3.  1.  1.  1.
  1   0   1.  0
  1   0   1.  0

 MIM: (4)  1*X1( 0)*X2( 2)
 Imp: (-2)  1*X1( 0, 0)*X2( 0, 3)

  0   1   1   1
  2.  1.  1.  1.
  0   0   1.  0
  0   0   1.  0

 MIM: (6)  1*X1( 2)*X2( 3)
 Imp: (-2)  1*X1( 2, 2)*X2( 0, 3)

  0   1   0   1
  2.  1.  4.  1.
  0   0   4.  0
  0   0   4.  0

 MIM: (4)  1*X1( 3)*X2( 0)
 Imp: (-2)  1*X1( 3, 3)*X2( 0, 1)

  0   1   0   0
  2.  1.  4.  4.
  0   0   4.  0
  0   0   4.  0

 MIM: (4)  1*X1( 1)*X2( 0)
 Imp: (-2)  1*X1( 1, 1)*X2( 0, 1)

  0   0   0   0
  2.  4.  4.  4.
  0   0   4.  0
  0   0   4.  0

 MIM: (6)  2*X1( 0)*X2( 1)
 Imp: (0)  3*X1( 0, 3)*X2( 1, 1)

  0   0   0   0
  4.  4.  4.  4.
  0   0   4.  0
  0   0   4.  0

1 OPT_PAR: 6/10 0.60 11:915 From node: 0,1




SOLUTION FROM MCND NODE #2 AND #3

Orig map(OPT_ND):
  1   1   1   1
  3.  3.  3.  3.
  1   2   3.  2
  1   2   3.  2

 MIM: (4)  2*X1( 3)*X2( 2)
 Imp: (-9)  2*X1( 1, 3)*X2( 1, 3)

  1   1   1   1
  3.  1.  1.  1.
  1   0   1.  0
  1   0   1.  0

 MIM: (4)  1*X1( 0)*X2( 3)
 Imp: (-2)  1*X1( 0, 0)*X2( 0, 3)

  0   1   1   1
  2.  1.  1.  1.
  0   0   1.  0
  0   0   1.  0

 MIM: (6)  2*X1( 0)*X2( 1)
 Imp: (0)  3*X1( 0, 3)*X2( 1, 1)

  0   1   1   1
  4.  4.  4.  4.
  0   0   1.  0
  0   0   1.  0

 MIM: (5)  1*X1( 3)*X2( 0)
 Imp: (-4)  1*X1( 1, 3)*X2( 0, 1)

  0   0   0   0
  4.  4.  4.  4.
  0   0   1.  0
  0   0   1.  0

 MIM: (5)  1*X1( 2)*X2( 3)
 Imp: (-2)  3*X1( 2, 2)*X2( 1, 3)

  0   0   0   0
  4.  4.  4.  4.
  0   0   4.  0
  0   0   4.  0

1 OPT_PAR: 5/10 0.50 11:241 From node: 2,3


SOLUTION FROM MCND NODE #4 THROUGH #7

Orig map(OPT_ND):
  1   1   1   1
  3.  3.  3.  3.
  1   2   3.  2
  1   2   3.  2

 MIM: (4)  1*X1( 0)*X2( 3)
 Imp: (-10)  1*X1( 0, 3)*X2( 0, 3)

  0   0   0   0
  2.  2.  2.  2.
  0   1   2.  1
  0   1   2.  1

 MIM: (4)  1*X1( 1)*X2( 2)
 Imp: (-6)  1*X1( 1, 3)*X2( 1, 3)

  0   0   0   0
  2.  1.  1.  1.
  0   0   1.  0
  0   0   1.  0

 MIM: (5)  1*X1( 2)*X2( 3)
 Imp: (-4)  3*X1( 2, 2)*X2( 1, 3)

  0   0   0   0
  2.  1.  4.  1.
  0   0   4.  0
  0   0   4.  0

 MIM: (6)  1*X1( 3)*X2( 1)
 Imp: (-4)  3*X1( 0, 3)*X2( 1, 1)

  0   0   0   0
  4.  4.  4.  4.
  0   0   4.  0
  0   0   4.  0

1 OPT_PAR: 4/10 0.40 10:014 From node: 4, 5, 6, 7